docs: complete project research

2026-03-27 00:59:24 +00:00
parent b45ae0400e
commit d9878dea73
6 changed files with 1268 additions and 0 deletions
--- a/.planning/research/.ARCHITECTURE.md.swp
+++ b/.planning/research/.ARCHITECTURE.md.swp
--- a/.planning/research/ARCHITECTURE.md
+++ b/.planning/research/ARCHITECTURE.md
@@ -0,0 +1,389 @@
 # Architecture Research
 **Domain:** Blazor WebAssembly AI Chat Application
 **Researched:** 2026-03-27
 **Confidence:** HIGH (Microsoft official docs + verified community patterns)
 ## Standard Architecture
 ### System Overview
 ```
 ┌──────────────────────────────────────────────────────────────────────┐
 │                        BROWSER (Blazor WASM)                         │
 ├──────────────────────────────────────────────────────────────────────┤
 │  ┌────────────────┐  ┌────────────────┐  ┌────────────────────────┐  │
 │  │  ChatPage      │  │  ConvList      │  │  MessageBubble         │  │
 │  │  (container)   │  │  (sidebar)     │  │  (leaf component)      │  │
 │  └───────┬────────┘  └───────┬────────┘  └────────────────────────┘  │
 │          │                  │                                         │
 │  ┌───────▼──────────────────▼─────────────────────────────────────┐  │
 │  │              ConversationStateService (Singleton DI)           │  │
 │  │   Holds: active conversation, message list, loading flag       │  │
 │  └───────┬────────────────────────────────────────────────────────┘  │
 │          │                                                            │
 │  ┌───────▼────────────────────────────────────────────────────────┐  │
 │  │                   ChatApiClient (HttpClient wrapper)           │  │
 │  │   POST /api/conversations, GET /api/stream?…, DELETE …        │  │
 │  └───────┬────────────────────────────────────────────────────────┘  │
 └──────────┼─────────────────────────────────────────────────────────┘
           │  HTTP / SSE (text/event-stream)
           │
 ┌──────────▼─────────────────────────────────────────────────────────┐
 │                   ASP.NET Core Minimal API (Server)                 │
 ├─────────────────────────────────────────────────────────────────────┤
 │  ┌──────────────────────┐   ┌──────────────────────────────────┐    │
 │  │  ChatEndpoints       │   │  ConversationEndpoints           │    │
 │  │  POST /api/chat      │   │  GET/POST/DELETE /api/…          │    │
 │  │  GET  /api/chat/…    │   │                                  │    │
 │  └──────────┬───────────┘   └───────────────┬──────────────────┘    │
 │             │                               │                        │
 │  ┌──────────▼───────────┐   ┌───────────────▼──────────────────┐    │
 │  │  OpenAiService       │   │  ConversationRepository          │    │
 │  │  (streams tokens via │   │  (reads/writes JSON files)       │    │
 │  │   openai-dotnet SDK) │   │                                  │    │
 │  └──────────┬───────────┘   └───────────────┬──────────────────┘    │
 │             │                               │                        │
 ├─────────────┼───────────────────────────────┼────────────────────────┤
 │             │ HTTPS                         │ local disk             │
 │       ┌─────▼──────┐              ┌─────────▼──────────────────┐    │
 │       │ OpenAI API │              │  ~/chat-data/              │    │
 │       │ (GPT-4o)   │              │  conversations/{id}.json   │    │
 │       └────────────┘              └────────────────────────────┘    │
 └─────────────────────────────────────────────────────────────────────┘
 ```
 ### Component Responsibilities
 | Component | Responsibility | Typical Implementation |
 |-----------|----------------|------------------------|
 | ChatPage | Top-level container — composes sidebar + chat panel, owns route | Blazor page component (`@page "/chat/{id?}"`) |
 | ConversationList | Lists saved conversations, triggers create/delete/switch | Child component with EventCallback to parent |
 | MessageList | Renders all messages in active conversation | Child component, iterates message model |
 | MessageBubble | Renders single message — user vs AI, markdown for AI | Leaf component, uses Markdig or similar |
 | ChatInput | Text area + send button, raises OnSend event | Child component with EventCallback |
 | ConversationStateService | Singleton in-memory state — active conversation, messages, streaming flag | C# service registered `AddSingleton`, raises `OnChange` events |
 | ChatApiClient | Wraps HttpClient, handles streaming plumbing | Scoped service, uses `SetBrowserResponseStreamingEnabled(true)` |
 | ChatEndpoints | Minimal API: accepts message, streams SSE response from OpenAI | Static endpoint methods wired in `Program.cs` |
 | ConversationEndpoints | Minimal API: CRUD for conversations | Static endpoint methods |
 | OpenAiService | Calls OpenAI SDK, returns `IAsyncEnumerable<string>` of tokens | Scoped service on server |
 | ConversationRepository | Read/write JSON files on disk | Singleton or Scoped service on server |
 ## Recommended Project Structure
 ```
 ChatAgentWebApp/
 ├── ChatAgentWebApp.Client/          # Blazor WASM project
 │   ├── Components/
 │   │   ├── Chat/
 │   │   │   ├── ChatPage.razor       # Page — route entry point
 │   │   │   ├── MessageList.razor    # Renders message history
 │   │   │   ├── MessageBubble.razor  # Single message (user/AI)
 │   │   │   └── ChatInput.razor      # Text input + send
 │   │   └── Conversations/
 │   │       └── ConversationList.razor  # Sidebar conversation switcher
 │   ├── Services/
 │   │   ├── ConversationStateService.cs  # In-memory singleton state
 │   │   └── ChatApiClient.cs             # HttpClient wrapper + SSE reading
 │   └── Program.cs                   # DI registration, HttpClient base URL
 │
 ├── ChatAgentWebApp.Server/          # ASP.NET Core Minimal API project
 │   ├── Endpoints/
 │   │   ├── ChatEndpoints.cs         # POST /api/chat/stream (SSE)
 │   │   └── ConversationEndpoints.cs # GET/POST/DELETE /api/conversations
 │   ├── Services/
 │   │   ├── OpenAiService.cs         # Wraps openai-dotnet SDK, yields tokens
 │   │   └── ConversationRepository.cs # JSON file read/write
 │   ├── Models/                      # Server-only models (request/response)
 │   └── Program.cs                   # Minimal API wiring, CORS, DI
 │
 └── ChatAgentWebApp.Shared/          # Shared library (both projects reference)
    └── Models/
        ├── Conversation.cs          # Shared model — id, title, createdAt
        └── ChatMessage.cs           # Shared model — role, content, timestamp
 ```
 ### Structure Rationale
 - **Client/Services/:** All HttpClient wiring and streaming logic lives here, not in components. Components stay dumb (data in via parameters, actions out via EventCallback).
 - **Shared/Models/:** Models used by both Client (display) and Server (serialization) live here. Eliminates duplicate DTOs.
 - **Server/Endpoints/:** Minimal API endpoint registration separated by concern (chat streaming vs conversation CRUD). Keeps Program.cs clean.
 - **Server/Services/:** OpenAI SDK calls and file I/O isolated from HTTP concerns. Enables testing without HTTP context.
 ## Architectural Patterns
 ### Pattern 1: SSE Streaming from Minimal API to WASM Client
 **What:** The server endpoint writes `text/event-stream` frames to the response as OpenAI tokens arrive. The WASM client reads the response as a stream using `SetBrowserResponseStreamingEnabled(true)` and processes tokens without waiting for the full response.
 **When to use:** Any time you need token-by-token streaming from an LLM. The alternative (wait for full response, then display) is noticeably worse UX for long answers.
 **Trade-offs:** Slightly more plumbing than a simple JSON response. SSE is one-directional (server to client), which is fine here — the client sends the initial message as a POST, and the stream returns the reply.
 **Server endpoint pattern:**
 ```csharp
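 // Server: ChatEndpoints.cs (requires: using System.Text;)
 app.MapPost("/api/chat/stream", async (ChatRequest request, OpenAiService ai,
    ConversationRepository repo, HttpContext http) =>
 {
    http.Response.Headers.ContentType = "text/event-stream";
    http.Response.Headers.CacheControl = "no-cache";
    var fullReply = new StringBuilder();  // Accumulates tokens so the reply can be persisted
    await foreach (var token in ai.StreamResponseAsync(request))
    {
        fullReply.Append(token);
        // NOTE: a token containing '\n' would break SSE framing; encode it before writing.
        await http.Response.WriteAsync($"data: {token}\n\n");
        await http.Response.Body.FlushAsync();
    }
    await http.Response.WriteAsync("event: done\ndata: end\n\n");
    // Persist the assembled reply once the stream has completed.
    // Assumes ChatMessage exposes settable Role (string) and Content properties.
    var assistantMessage = new ChatMessage { Role = "assistant", Content = fullReply.ToString() };
    await repo.AppendMessageAsync(request.ConversationId, assistantMessage);
 });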
 ```
 **Client consumption pattern:**
 ```csharp
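 // Client: ChatApiClient.cs
 // requires: using Microsoft.AspNetCore.Components.WebAssembly.Http; using System.Net.Http.Json;
 var req = new HttpRequestMessage(HttpMethod.Post, "/api/chat/stream");
 req.SetBrowserResponseStreamingEnabled(true);  // Critical for WASM
 req.Content = JsonContent.Create(chatRequest);
 using var response = await _httpClient.SendAsync(req,
    HttpCompletionOption.ResponseHeadersRead);  // Don't buffer full body
 response.EnsureSuccessStatusCode();
 using var stream = await response.Content.ReadAsStreamAsync();
 using var reader = new StreamReader(stream);
 string? line;
 // ReadLineAsync returns null at end of stream; EndOfStream is avoided because it reads synchronously.
 while ((line = await reader.ReadLineAsync()) is not null)
 {
    if (line.StartsWith("event: done")) break;  // Terminal frame: its "data: end" line is not a token
    if (line.StartsWith("data: "))
    {
        var token = line[6..];
        // AppendToken raises OnChange; subscribed components re-render themselves.
        // InvokeAsync(StateHasChanged) belongs in components, not in this service.
        _stateService.AppendToken(token);
    }
 }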
 ```
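 One gap in the two snippets above is that a model token may itself contain a newline, which would end a `data:` line early. The SSE format allows a payload to span several `data:` lines, which the receiver rejoins with `\n`. The following is a minimal sketch of that framing, assuming a hypothetical helper named `SseFrame` (it is not part of the project or of any library):
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.Linq;
 // Hypothetical helper, not part of any library: frames arbitrary text as one SSE event.
 public static class SseFrame
 {
     // Emits one "data: ..." line per line of payload, then the blank line that ends the event.
     public static string Encode(string payload)
     {
         var lines = payload.Replace("\r\n", "\n").Split('\n');
         return string.Concat(lines.Select(l => $"data: {l}\n")) + "\n";
     }
     // Rejoins the data lines of a single event with '\n'.
     public static string Decode(string frame)
     {
         var parts = new List<string>();
         foreach (var line in frame.Split('\n'))
         {
             if (line.StartsWith("data: ", StringComparison.Ordinal)) parts.Add(line[6..]);
             else if (line == "data:") parts.Add(string.Empty);
         }
         return string.Join("\n", parts);
     }
 }
 ```
 With this in place the server would write `SseFrame.Encode(token)` instead of the interpolated string, and the client would collect lines up to the blank separator before decoding. An alternative with the same effect is to JSON-encode each token on a single `data:` line.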
 ### Pattern 2: Singleton State Service as Shared State
 **What:** A `ConversationStateService` registered as a singleton (in WASM, scoped = singleton anyway) holds the active conversation, message list, and streaming state. Components subscribe to an `OnChange` event to re-render when state updates.
 **When to use:** When multiple components need to reflect the same data — the sidebar list, the message pane, and the input box all depend on the same active conversation.
 **Trade-offs:** Simple and explicit. Not as formal as Redux/Flux but appropriate for single-user personal tool. Avoids prop-drilling through component hierarchies.
 **Example:**
 ```csharp
 // ConversationStateService.cs
 public class ConversationStateService
 {
    public Conversation? ActiveConversation { get; private set; }
    public List<ChatMessage> Messages { get; } = new();
    public bool IsStreaming { get; private set; }
    public event Action? OnChange;  // Components subscribe to this
    public void SetActive(Conversation conv)
    {
        ActiveConversation = conv;
        Messages.Clear();
        NotifyStateChanged();
    }
    // Toggles the streaming flag (drives the loading indicator and the disabled input)
    public void SetStreaming(bool isStreaming)
    {
        IsStreaming = isStreaming;
        NotifyStateChanged();
    }
    public void AppendToken(string token)
    {
        // Append to last message (AI response being streamed).
        // The caller must have added an empty assistant message first.
        if (Messages.Count == 0) return;
        Messages[^1].Content += token;
        NotifyStateChanged();
    }
    private void NotifyStateChanged() => OnChange?.Invoke();
 }
 ```
 ### Pattern 3: Repository for JSON File Persistence
 **What:** `ConversationRepository` encapsulates all file I/O. Each conversation is stored as `{id}.json` in a configured data directory. The repository loads/saves these files and maintains an in-memory index for listing.
 **When to use:** Always — never write `File.ReadAll...` directly in endpoint handlers. Even for JSON files, the repository pattern keeps concerns separate and makes the storage medium swappable.
 **Trade-offs:** Slight overhead vs inline file I/O. But it cleanly separates persistence from HTTP handling and makes unit testing possible without touching the filesystem.
 ```csharp
 // ConversationRepository.cs
 public class ConversationRepository
 {
    private readonly string _dataDir;
    public async Task<List<Conversation>> GetAllAsync() { ... }
    public async Task<Conversation?> GetByIdAsync(string id) { ... }
    public async Task SaveAsync(Conversation conv) { ... }
    public async Task DeleteAsync(string id) { ... }
    public async Task AppendMessageAsync(string id, ChatMessage msg) { ... }
 }
 ```
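 The method bodies above are elided. As a rough illustration of the file-per-conversation idea, here is a minimal sketch using `System.Text.Json`. The names `JsonConversationStore` and `ConversationFile` are hypothetical stand-ins, and the in-memory index and `AppendMessageAsync` are left out:
 ```csharp
 using System;
 using System.IO;
 using System.Text.Json;
 using System.Threading.Tasks;
 // Hypothetical stand-in for the shared Conversation model.
 public record ConversationFile(string Id, string Title);
 // Sketch of a file-per-conversation store; names and layout are assumptions.
 public class JsonConversationStore
 {
     private readonly string _dataDir;
     public JsonConversationStore(string dataDir)
     {
         _dataDir = dataDir;
         Directory.CreateDirectory(_dataDir);  // No-op if the directory already exists
     }
     // Path.GetFileName drops any directory part, so an id like "../x" cannot escape _dataDir.
     private string PathFor(string id) =>
         Path.Combine(_dataDir, Path.GetFileName(id) + ".json");
     public async Task SaveAsync(ConversationFile conv) =>
         await File.WriteAllTextAsync(PathFor(conv.Id), JsonSerializer.Serialize(conv));
     public async Task<ConversationFile?> GetByIdAsync(string id)
     {
         var path = PathFor(id);
         if (!File.Exists(path)) return null;
         return JsonSerializer.Deserialize<ConversationFile>(await File.ReadAllTextAsync(path));
     }
     public Task DeleteAsync(string id)
     {
         File.Delete(PathFor(id));  // Does not throw when the file is already gone
         return Task.CompletedTask;
     }
 }
 ```
 Because the store takes its directory as a constructor argument, a test can point it at a temporary folder and exercise it without any HTTP context.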
 ## Data Flow
 ### Sending a Message and Receiving a Streaming Response
 ```
 [User types message + clicks Send]
         ↓
 [ChatInput.razor] → OnSend EventCallback
         ↓
 [ChatPage.razor] calls ConversationStateService.SetStreaming(true)
         ↓
 [ChatApiClient.PostMessageStreamAsync()] — POST /api/chat/stream
         ↓
 [Server: ChatEndpoints] receives request
         ↓
 [OpenAiService.StreamResponseAsync()] → calls OpenAI GPT API
         ↓
 [OpenAI returns streaming response] — tokens arrive incrementally
         ↓
 [Server writes SSE frames] → "data: Hello\n\n", "data: world\n\n", ...
         ↓
 [Client StreamReader] reads lines as they arrive (ResponseHeadersRead)
         ↓
 [ConversationStateService.AppendToken()] mutates last message
         ↓
 [OnChange event fires] → all subscribed components call StateHasChanged()
         ↓
 [MessageList re-renders] — user sees tokens appearing in real time
         ↓
 [Server sends "event: done"] → client marks IsStreaming = false
         ↓
 [Server persists full response] → ConversationRepository.AppendMessageAsync()
 ```
 ### Conversation Management Flow
 ```
 [User clicks "New Conversation"]
         ↓
 [ConversationList.razor] → EventCallback to ChatPage
         ↓
 [ChatApiClient.CreateConversationAsync()] → POST /api/conversations
         ↓
 [Server: ConversationEndpoints] → ConversationRepository.SaveAsync()
         ↓
 [Returns new Conversation object]
         ↓
 [ConversationStateService.SetActive(newConv)] → clears message list
         ↓
 [All components re-render] — empty chat ready for input
 ```
 ### State Management Summary
 ```
 ConversationStateService (Singleton in WASM)
    OnChange event
         ↓ (subscribed in OnInitialized, unsubscribed in Dispose)
 [ChatPage] [MessageList] [ConversationList] [ChatInput]
         ↑
    Mutations via method calls (SetActive, AppendToken, SetStreaming)
         ↑
    Triggered by ChatApiClient responses
 ```
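 The subscribe-then-unsubscribe lifecycle in the diagram above can be sketched without Blazor. In this plain C# illustration, `StateStore` and `Subscriber` are hypothetical stand-ins for the state service and a component; a real component would subscribe in `OnInitialized` and implement `IDisposable`:
 ```csharp
 using System;
 // Minimal stand-in for the state service: only the change notification is modelled.
 public class StateStore
 {
     public event Action? OnChange;
     public void Notify() => OnChange?.Invoke();
 }
 // Stand-in for a component: subscribes on creation, unsubscribes in Dispose.
 public sealed class Subscriber : IDisposable
 {
     private readonly StateStore _store;
     public int RenderCount { get; private set; }
     public Subscriber(StateStore store)
     {
         _store = store;
         _store.OnChange += HandleChange;  // In a component this would go in OnInitialized
     }
     // A real component would call InvokeAsync(StateHasChanged) here instead of counting.
     private void HandleChange() => RenderCount++;
     // Without this, the singleton's event keeps referencing and calling the disposed subscriber.
     public void Dispose() => _store.OnChange -= HandleChange;
 }
 ```
 The unsubscribe step matters because the state service outlives every component: a handler that is never removed keeps the component reachable and keeps invoking it after it has left the page.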
 ## Build Order Implications
 Components have clear dependency layers. Build from the bottom up:
 1. **Shared Models** — `Conversation`, `ChatMessage` (no deps, both projects need these)
 2. **ConversationRepository** — file I/O, no HTTP (testable in isolation)
 3. **OpenAiService** — OpenAI SDK calls, yields `IAsyncEnumerable<string>`
 4. **Server Endpoints** — wires services to HTTP (depends on 2 and 3)
 5. **ChatApiClient** — WASM HTTP client + SSE consumer (depends on server being up)
 6. **ConversationStateService** — in-memory state (depends on models, no HTTP)
 7. **Leaf UI components** — `MessageBubble`, `ChatInput`, `ConversationList` (pure display)
 8. **Container components** — `MessageList`, `ChatPage` (compose leaves, use state service)
 This order maps naturally to build phases: backend first (phases 1-3), then state layer, then UI.
 ## Scaling Considerations
 This is a single-user personal tool. Scaling is not a concern for v1.
 | Scale | Architecture Adjustment |
 |-------|--------------------------|
 | 1 user (current) | Monolith fine, JSON files fine, no auth needed |
 | Multi-user | Add auth, move to SQLite or Postgres, scope state service per user |
 | Cloud deploy | Externalize API key via Azure Key Vault, containerize server |
 ### Scaling Priorities (if needed in v2+)
 1. **First bottleneck:** JSON files don't support concurrent writes — add SQLite (EF Core migration is straightforward)
 2. **Second bottleneck:** Single server process — Blazor WASM + separate API already decoupled; scale API independently
 ## Anti-Patterns
 ### Anti-Pattern 1: Calling OpenAI from the WASM Client
 **What people do:** Register `HttpClient` in the Blazor WASM project and call `api.openai.com` directly.
 **Why it's wrong:** The OpenAI API key is visible to anyone who opens browser DevTools. The key is in the `Authorization` header of every request.
 **Do this instead:** All OpenAI calls go through the backend API. The server reads the key from `appsettings.json` or environment variables (server-side, never shipped to browser).
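 A small sketch of the server-side half of that advice, assuming the key is supplied through an environment variable. The variable name `OPENAI_API_KEY` and the helper `ApiKeyResolver` are assumptions for illustration, not names the project defines:
 ```csharp
 using System;
 // Hypothetical helper: resolves the OpenAI key on the server and fails fast when it is absent.
 public static class ApiKeyResolver
 {
     // The lookup is passed in so the helper can be tested without touching the real environment.
     public static string Resolve(Func<string, string?> readVariable)
     {
         var key = readVariable("OPENAI_API_KEY");  // Assumed variable name
         if (string.IsNullOrWhiteSpace(key))
             throw new InvalidOperationException(
                 "OPENAI_API_KEY is not set on the server; refusing to start.");
         return key;
     }
 }
 ```
 On the server this could be called once at startup as `ApiKeyResolver.Resolve(Environment.GetEnvironmentVariable)`, so a missing key surfaces immediately rather than on the first chat request.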
 ### Anti-Pattern 2: Buffering the Streaming Response
 **What people do:** Call `await response.Content.ReadAsStringAsync()` after `SendAsync`, then parse the complete response.
 **Why it's wrong:** In Blazor WASM, without `SetBrowserResponseStreamingEnabled(true)` and `ResponseHeadersRead`, the browser buffers the entire response before making any of it available. The UI shows nothing until the full AI response is complete — defeating the entire point of streaming.
 **Do this instead:** Set `SetBrowserResponseStreamingEnabled(true)` on the `HttpRequestMessage` and use `HttpCompletionOption.ResponseHeadersRead` in `SendAsync`. Read the response as a stream line by line.
 ### Anti-Pattern 3: Calling StateHasChanged from a Background Thread
 **What people do:** Mutate state and call `StateHasChanged()` directly inside async token-reading loops.
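 **Why it's wrong:** In Blazor WASM this is mostly harmless today because the runtime is single-threaded. The same code throws under Blazor Server when it runs off the renderer's synchronization context, and multi-threaded WASM is being worked on in recent .NET releases, so the single-threaded guarantee may not last. The correct pattern also reads more clearly.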
 **Do this instead:** Use `await InvokeAsync(StateHasChanged)` when updating UI from within async callbacks or loops. This schedules re-render on the correct synchronization context and is safe across all hosting models.
 ### Anti-Pattern 4: Fat Components (Logic in Razor Files)
 **What people do:** Put API call logic, JSON deserialization, and state mutation directly in `.razor` component code blocks.
 **Why it's wrong:** The tutorial nature of this project means code must be readable. Logic buried in components is hard to explain, hard to test, and violates single responsibility. The builder is also learning Blazor patterns — fat components teach bad habits.
 **Do this instead:** Components only call services. Services own all logic. This also demonstrates the Blazor DI pattern explicitly, which is a key learning objective.
 ## Integration Points
 ### External Services
 | Service | Integration Pattern | Notes |
 |---------|---------------------|-------|
 | OpenAI GPT API | `openai-dotnet` SDK on server, `IAsyncEnumerable<string>` returned | Never from WASM client. Key in `appsettings.json` on server. |
 | Browser FileSystem | None — all persistence is server-side JSON files | WASM cannot write to local disk; server has full file access |
 ### Internal Boundaries
 | Boundary | Communication | Notes |
 |----------|---------------|-------|
 | WASM Client ↔ API Server | HTTP / SSE via `HttpClient` | Configure base URL + CORS. During development, server serves client static files (hosted model). |
 | ChatPage ↔ Child Components | Blazor parameters + EventCallback | Downward via `[Parameter]`, upward via `EventCallback<T>` |
 | Components ↔ State Service | Injected singleton, `OnChange` event subscription | Components subscribe in `OnInitialized`, unsubscribe in `IDisposable.Dispose` |
 | Endpoints ↔ Services | Constructor DI | Both `OpenAiService` and `ConversationRepository` injected into endpoint handlers |
 | OpenAiService ↔ ConversationRepository | No direct coupling | Endpoint coordinates both — calls AI service, then persists to repo |
 ## Sources
 - [Microsoft Docs: Call a web API from Blazor (aspnetcore-10.0)](https://learn.microsoft.com/en-us/aspnet/core/blazor/call-web-api?view=aspnetcore-10.0)
 - [Meziantou: Streaming an HTTP response in Blazor WebAssembly](https://www.meziantou.net/streaming-an-http-response-in-blazor-webassembly.htm)
 - [Strathweb: Built-in support for Server Sent Events in .NET 9](https://www.strathweb.com/2024/07/built-in-support-for-server-sent-events-in-net-9/)
 - [Petkir: Stream chat to your frontend with SSE in ASP.NET Core (.NET 10)](https://www.petkir.at/blog/semantic-kernel/01_chat_03_sse)
 - [openai/openai-dotnet issue #65: Streaming doesn't work properly in Blazor WASM](https://github.com/openai/openai-dotnet/issues/65)
 - [Microsoft Docs: Blazor project structure](https://learn.microsoft.com/en-us/aspnet/core/blazor/project-structure?view=aspnetcore-10.0)
 - [Microsoft Docs: Blazor state management](https://learn.microsoft.com/en-us/aspnet/core/blazor/state-management/?view=aspnetcore-10.0)
 - [Syncfusion: MVVM Pattern in Blazor For State Management](https://www.syncfusion.com/blogs/post/mvvm-pattern-blazor-state-management)
 - [PalmHill.BlazorChat — reference implementation (WASM + WebAPI + real-time LLM)](https://github.com/edgett/PalmHill.BlazorChat)
 ---
 *Architecture research for: Blazor WebAssembly AI Chat Application*
 *Researched: 2026-03-27*
--- a/.planning/research/FEATURES.md
+++ b/.planning/research/FEATURES.md
@@ -0,0 +1,202 @@
 # Feature Research
 **Domain:** Personal AI chat web application (single-user, OpenAI GPT backend, Blazor WebAssembly)
 **Researched:** 2026-03-27
 **Confidence:** HIGH (core features verified against live ChatGPT/Claude, OpenAI API docs, and Blazor ecosystem)
 ## Feature Landscape
 ### Table Stakes (Users Expect These)
 Features users assume exist. Missing these = product feels incomplete.
 | Feature | Why Expected | Complexity | Notes |
 |---------|--------------|------------|-------|
 | Send message and receive response | Core function of any AI chat app | LOW | POST to backend API; backend calls OpenAI chat completions endpoint |
 | Streaming token-by-token responses | ChatGPT normalized this; blocking responses feel broken | MEDIUM | Server-Sent Events (SSE) from backend; HttpClient streaming or SignalR on WASM client; creates "typewriter" effect |
 | Markdown rendering in AI responses | GPT always responds with markdown; raw markdown is unreadable | MEDIUM | Markdig library on backend or client; MarkupString in Blazor to render HTML; Markdown.ColorCode for syntax highlighting |
 | Syntax-highlighted code blocks | Code responses are a primary GPT use-case; unformatted code is unusable | MEDIUM | Markdown.ColorCode NuGet package; note: CsharpToColouredHTML has WASM compatibility issues — use base ColorCode package |
 | Copy-to-clipboard on code blocks | Standard expectation from ChatGPT/Claude; users paste code constantly | LOW | JavaScript interop in Blazor (`navigator.clipboard.writeText`); small JS interop call |
 | Multiple named conversations | Users need to separate topics; single-thread apps feel like a toy | MEDIUM | Conversation list in sidebar; each conversation has ID, title, message list; JSON file per conversation |
 | Create and switch between conversations | Navigation between conversations is core workflow | LOW | Once multi-conversation storage exists, switching is just loading by ID |
 | Delete conversations | Users need to clean up; no delete = clutter accumulates | LOW | Remove JSON file; update conversation list |
 | Persist conversation history across sessions | Without persistence, app is useless after refresh | MEDIUM | JSON file storage on disk; load on startup; save on every message |
 | Auto-scroll to latest message | Standard chat behavior; missing it feels broken | LOW | JavaScript interop to scroll div; or CSS scroll-behavior |
 | Loading/thinking indicator | Users need feedback that a request is in-flight | LOW | Show spinner or "..." while awaiting first token; hide once streaming starts |
 | Input disabled during response | Prevent double-submit while response is streaming | LOW | Boolean state flag; disable textarea and button while `isStreaming = true` |
 | Send on Enter key | Standard text input convention for chat | LOW | `@onkeydown` handler; Shift+Enter for newline |
 | Responsive layout | Mobile-friendly is expected even for personal tools | LOW | CSS flexbox/grid; sidebar collapses on small screens |
 ### Differentiators (Competitive Advantage)
 Features that set the product apart. Not required, but valuable.
 | Feature | Value Proposition | Complexity | Notes |
 |---------|-------------------|------------|-------|
 | Auto-generated conversation titles | Reduces naming friction; GPT can summarize first message as title | LOW | Call GPT with "Summarize this in 5 words" after first exchange; update conversation title |
 | System prompt / persona configuration | Power users want to customize GPT behavior per conversation | MEDIUM | Add `systemPrompt` field to conversation model; include as first message in API payload |
 | Message edit and regenerate | Fix typos without starting over; common in ChatGPT | MEDIUM | Truncate conversation at edited message; resend; requires re-streaming |
 | Token usage display | Helps users understand context window consumption; teaches GPT behavior | LOW | OpenAI API returns usage in response; display in footer or message metadata |
 | Conversation search | Find a past conversation by keyword | MEDIUM | Client-side search over loaded conversation titles; full-text needs indexing |
 | Export conversation | Save as markdown or text file | LOW | Serialize messages to markdown string; trigger browser download via JS interop |
 | Model selector (GPT-4o vs GPT-4o-mini) | Cost vs quality tradeoff is real; power users want control | LOW | Dropdown stored in app settings or per-conversation; passed as `model` parameter in API call |
 | Well-commented tutorial-style code | The project doubles as a Blazor learning resource | LOW (implementation cost) | Inline `// Blazor:` comments on lifecycle hooks, DI, component patterns — this is a core differentiator for this specific project's purpose |
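 To illustrate the "Export conversation" row above, serializing messages to a markdown string can be as small as the following sketch. `ExportMessage` is a hypothetical stand-in for the shared `ChatMessage` model, and the output layout is only one possible choice:
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.Text;
 // Hypothetical stand-in for the shared ChatMessage model (role, content).
 public record ExportMessage(string Role, string Content);
 public static class MarkdownExporter
 {
     // Renders a conversation as markdown: a title heading, then one labelled block per message.
     public static string ToMarkdown(string title, IEnumerable<ExportMessage> messages)
     {
         var sb = new StringBuilder();
         sb.Append("# ").Append(title).Append("\n\n");
         foreach (var m in messages)
         {
             var speaker = m.Role == "user" ? "You" : "Assistant";
             sb.Append("**").Append(speaker).Append(":**\n\n");
             sb.Append(m.Content.TrimEnd()).Append("\n\n");
         }
         return sb.ToString();
     }
 }
 ```
 The resulting string would then be handed to a JS interop call that triggers the browser download, which is outside the scope of this sketch.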
 ### Anti-Features (Commonly Requested, Often Problematic)
 Features that seem good but create problems.
 | Feature | Why Requested | Why Problematic | Alternative |
 |---------|---------------|-----------------|-------------|
 | Authentication / login | "What if someone else uses it?" | Single-user personal tool; adds OAuth complexity with zero value | Leave open; document that it's intentionally single-user |
 | Database (SQL/SQLite) | "JSON doesn't scale" | Premature optimization for a personal tool; adds EF Core migration complexity | JSON files are fast, human-readable, and zero-setup — perfect for this scope |
 | Real-time sync across tabs | "What if I have two windows open?" | SignalR state sync complexity; no real use case for single user | Reload on focus; acceptable for personal tool |
 | Plugin / tool calling system | "GPT can call functions!" | That's LangChain/MCP territory; v2 scope — building it now adds architecture complexity before core chat works | Defer to v2 milestone with LangChain and MCP servers |
 | Voice input / output | ChatGPT has it | OpenAI Realtime API is being deprecated May 2026; adds Web Speech API complexity; out of scope | Text-only for v1 |
 | Image uploads / multimodal | GPT-4o supports it | WASM file upload + base64 encoding + vision API adds significant complexity | Text chat first; defer multimodal to v2 |
 | Conversation branching | "What if I want to explore different answers?" | Complex tree data structure; confusing UX; rare real-world use | Regenerate last response is sufficient for 95% of cases |
 | Infinite scroll / lazy loading | "What about long conversations?" | Adds virtual scrolling complexity; JSON load is fine at personal scale | Load full conversation on select; revisit if performance suffers |
 | PWA / offline support | "Make it installable" | Service worker complexity; AI chat requires internet anyway | Responsive web design is sufficient |
 ## Feature Dependencies
 ```
 [JSON File Storage]
    └──required by──> [Multiple Conversations]
                          └──required by──> [Create / Switch / Delete Conversations]
                          └──required by──> [Persist History Across Sessions]
 [OpenAI API Call (blocking)]
    └──required by──> [Streaming Responses]
                          └──required by──> [Loading Indicator]
                          └──required by──> [Input Disabled During Streaming]
 [Markdown Rendering]
    └──required by──> [Syntax-Highlighted Code Blocks]
                          └──required by──> [Copy-to-Clipboard on Code Blocks]
 [Streaming Responses] ──enhances──> [Auto-Scroll to Latest Message]
 [Multiple Conversations] ──enables──> [Auto-Generated Titles]
 [Multiple Conversations] ──enables──> [Conversation Search]
 [Multiple Conversations] ──enables──> [Export Conversation]
 [System Prompt Config] ──enhances──> [Multiple Conversations]
    (each conversation can have its own persona)
 [Token Usage Display] ──conflicts with [Streaming]
    (when streaming, usage arrives only in a final chunk, and only if requested via stream_options.include_usage)
 ```
 ### Dependency Notes
 - **JSON File Storage required by Multiple Conversations:** Conversations need somewhere to live before the list/switch UI can be built. Storage phase must precede conversation management phase.
 - **Blocking API call required by Streaming:** Must implement the non-streaming call first to understand the request/response shape, then layer SSE streaming on top.
 - **Markdown Rendering required by Syntax Highlighting:** Markdown.ColorCode is a Markdig pipeline extension, so Markdig must be wired in before syntax highlighting can be added.
 - **Token Usage Display conflicts with Streaming:** When streaming, OpenAI sends content in individual chunks and omits `usage` unless the request sets `stream_options: {"include_usage": true}`. With that option, `usage` arrives in one extra final chunk whose `choices` array is empty, after the chunk carrying `finish_reason: stop`. Implementation must capture that last chunk separately.
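 As a sketch of the last note, the following reads token usage from the raw JSON payloads of a streamed completion. It assumes the payloads are the `data:` bodies as the OpenAI HTTP API sends them; the openai-dotnet SDK surfaces usage through its own types, so `UsageReader` is a hypothetical helper for illustration only:
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.Text.Json;
 public static class UsageReader
 {
     // Returns total_tokens from the chunk that carries a non-null "usage" object,
     // or null if no chunk in the stream carries one.
     public static int? TotalTokens(IEnumerable<string> chunkJsonPayloads)
     {
         int? total = null;
         foreach (var json in chunkJsonPayloads)
         {
             if (json == "[DONE]") break;  // Stream terminator sent by the API, not JSON
             using var doc = JsonDocument.Parse(json);
             if (doc.RootElement.TryGetProperty("usage", out var usage)
                 && usage.ValueKind == JsonValueKind.Object)
             {
                 total = usage.GetProperty("total_tokens").GetInt32();
             }
         }
         return total;
     }
 }
 ```
 Content chunks carry `"usage": null` (or no `usage` property at all), which is why the check is on the value kind rather than on the property's presence.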
 ## MVP Definition
 ### Launch With (v1)
 Minimum viable product — what's needed to validate the concept.
 - [ ] Send message, receive non-streaming GPT response — validates API connectivity and basic loop
 - [ ] Streaming responses — core UX differentiator; blocking responses feel unacceptable
 - [ ] Markdown rendering with syntax highlighting — GPT responses are markdown; unrendered output is unusable
 - [ ] Create / switch / delete multiple conversations — without this, the app is a single disposable thread
 - [ ] JSON file persistence — conversations must survive page refresh to be useful
 - [ ] Auto-scroll and loading indicator — baseline polish that makes the app feel complete
 - [ ] Copy-to-clipboard on code blocks — high-frequency action for developer-focused use
 - [ ] Tutorial-style inline code comments — this project's defining purpose as a learning resource
 ### Add After Validation (v1.x)
 Features to add once core is working.
 - [ ] Auto-generated conversation titles — reduces friction once the core loop is validated
 - [ ] System prompt / persona configuration — natural extension once multi-conversation is stable
 - [ ] Model selector — easy add once API layer is clean; real value for cost control
 - [ ] Export conversation — low complexity, high occasional value
 ### Future Consideration (v2+)
 Features to defer until product-market fit is established.
 - [ ] Message edit and regenerate — medium complexity; wait until core loop is solid
 - [ ] Token usage display — useful but not blocking; needs streaming completion handling
 - [ ] Conversation search — only valuable when there are many conversations to search
 - [ ] LangChain / agentic workflows — explicitly v2 scope per PROJECT.md
 - [ ] RAG document retrieval — v2 scope
 - [ ] MCP server integration — v2 scope
 ## Feature Prioritization Matrix
 | Feature | User Value | Implementation Cost | Priority |
 |---------|------------|---------------------|----------|
 | Send message / receive response | HIGH | LOW | P1 |
 | Streaming responses | HIGH | MEDIUM | P1 |
 | Markdown + syntax highlighting | HIGH | MEDIUM | P1 |
 | Multiple conversations + persistence | HIGH | MEDIUM | P1 |
 | Auto-scroll + loading indicator | HIGH | LOW | P1 |
 | Copy code to clipboard | HIGH | LOW | P1 |
 | Tutorial-style code comments | HIGH (for this project) | LOW | P1 |
 | Auto-generated conversation titles | MEDIUM | LOW | P2 |
 | System prompt configuration | MEDIUM | MEDIUM | P2 |
 | Model selector | MEDIUM | LOW | P2 |
 | Export conversation | LOW | LOW | P2 |
 | Token usage display | LOW | LOW | P3 |
 | Message edit and regenerate | MEDIUM | MEDIUM | P3 |
 | Conversation search | LOW | MEDIUM | P3 |
 **Priority key:**
 - P1: Must have for launch
 - P2: Should have, add when possible
 - P3: Nice to have, future consideration
 ## Competitor Feature Analysis
 | Feature | ChatGPT (OpenAI) | Claude (Anthropic) | This Project |
 |---------|------------------|--------------------|--------------|
 | Streaming responses | Yes, token-by-token | Yes, token-by-token | Yes — SSE via backend API |
 | Markdown rendering | Yes | Yes | Yes — Markdig + MarkupString |
 | Syntax highlighted code | Yes + copy button | Yes + copy button | Yes — Markdown.ColorCode |
 | Multiple conversations (sidebar) | Yes | Yes (Projects) | Yes — JSON file per conversation |
 | Conversation persistence | Yes (cloud) | Yes (cloud) | Yes — local JSON files |
 | Auto-generated titles | Yes | Yes | v1.x — GPT summarization call |
 | System prompt | Via custom instructions | Via system prompt | v1.x — per-conversation field |
 | Model selector | Yes (GPT-5.4 variants) | Yes (Opus/Sonnet/Haiku) | v1.x — GPT-4o vs GPT-4o-mini |
 | Voice input/output | Yes (Advanced Voice Mode) | No | Deliberately excluded from v1 |
 | Image uploads | Yes (multimodal) | Yes (multimodal) | Deliberately excluded from v1 |
 | Plugin / tool calling | Yes (via GPT Actions) | Yes (via tool use) | v2 — MCP servers |
 | RAG / document search | Yes (file attachments) | Yes (Projects + files) | v2 — RAG milestone |
 | Auth / multi-user | Yes | Yes | Deliberately excluded (single user) |
 ## Blazor-Specific Implementation Notes
 These are not features per se, but implementation constraints that affect feature complexity in Blazor WebAssembly:
 - **JavaScript interop is required for:** clipboard access, scroll-to-bottom, and syntax highlighting when it is done with a client-side JS library (for example Highlight.js, as an alternative to the C#-based ColorCode)
 - **API key must never reach WASM client:** all OpenAI calls must go through the ASP.NET Core backend API — the WASM client calls the backend, the backend calls OpenAI
 - **Streaming from backend to WASM client:** options are SSE (Server-Sent Events via `HttpClient` streaming) or SignalR HubConnection; SSE is simpler for one-way server-to-client streaming; SignalR is better if bidirectional messaging is needed later
 - **MarkupString in Blazor:** required to render HTML from Markdig; must be used intentionally as it bypasses Blazor's XSS protections — only render trusted content (GPT output is untrusted; sanitize or accept risk as single-user personal tool)
 - **Markdown.ColorCode WASM note:** base `Markdown.ColorCode` package works in WASM; `Markdown.ColorCode.CSharpToColoredHtml` does NOT — avoid the latter
 ## Sources
 - [ChatGPT vs Claude feature comparison 2026 — LogicWeb](https://www.logicweb.com/chatgpt-vs-claude-ultimate-ai-comparison-in-2026/)
 - [OpenAI Streaming API documentation](https://platform.openai.com/docs/api-reference/chat/streaming)
 - [OpenAI Streaming Responses Guide](https://developers.openai.com/api/docs/guides/streaming-responses)
 - [OpenAI Conversation State Guide](https://platform.openai.com/docs/guides/conversation-state)
 - [Best practices for OpenAI Chat streaming UI — Pamela Fox](http://blog.pamelafox.org/2023/09/best-practices-for-openai-chat-apps_16.html)
 - [PalmHill.BlazorChat — Blazor WASM + LLM reference implementation](https://github.com/edgett/PalmHill.BlazorChat)
 - [Blazor Live Preview Markdown with Markdig — Syncfusion](https://www.syncfusion.com/blogs/post/blazor-live-preview-markdown-editors-content-using-markdig-library)
 - [Markdown.ColorCode NuGet package](https://www.nuget.org/packages/Markdown.ColorCode)
 - [16 Chat UI Design Patterns 2025](https://bricxlabs.com/articles/message-screen-ui-deisgn)
 - [AI Chat Interface UX Patterns — UXPatterns.dev](https://uxpatterns.dev/patterns/ai-intelligence/ai-chat)
 - [Conversational AI UI Comparison 2025 — IntuitionLabs](https://intuitionlabs.ai/articles/conversational-ai-ui-comparison-2025)
 - [Token management best practices — OpenAI Community](https://community.openai.com/t/best-practices-for-cost-efficient-high-quality-context-management-in-long-ai-chats/1373996)
 ---
 *Feature research for: Personal AI Chat WebApp (Blazor WebAssembly + OpenAI GPT)*
 *Researched: 2026-03-27*
--- a/.planning/research/PITFALLS.md
+++ b/.planning/research/PITFALLS.md
@@ -0,0 +1,302 @@
 # Pitfalls Research
 **Domain:** Blazor WebAssembly AI Chat Application (OpenAI GPT, JSON storage, streaming)
 **Researched:** 2026-03-27
 **Confidence:** HIGH (multiple authoritative sources: official GitHub issues, Microsoft docs, verified community findings)
 ---
 ## Critical Pitfalls
 ### Pitfall 1: Streaming Silently Broken in Blazor WASM Without Custom Transport
 **What goes wrong:**
 OpenAI's .NET SDK streaming (`CompleteChatStreamingAsync`, returning `IAsyncEnumerable`) does not stream token-by-token in Blazor WASM. The entire response arrives at once after generation completes, making it appear like a non-streaming call. There is no error thrown — it just does not stream. This is confirmed in the official `openai-dotnet` GitHub issue tracker (#65).
 **Why it happens:**
 Blazor WASM uses a browser-based `HttpClient` backed by the Fetch API via JS interop. Response streaming requires explicitly calling `SetBrowserResponseStreamingEnabled(true)` on the underlying `HttpRequestMessage`. The OpenAI .NET SDK does not set this flag by default. Without it, the browser buffers the entire response body before exposing it to the .NET layer.
 **How to avoid:**
 Create a custom `HttpClientPipelineTransport` that overrides `OnSendingRequest` to enable browser streaming:
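 ```csharp
 using System.ClientModel.Primitives;
 using System.Net.Http;
 using Microsoft.AspNetCore.Components.WebAssembly.Http;
 using OpenAI;
 using OpenAI.Chat;
 // Transport that opts every SDK request into browser response streaming.
 public class BlazorHttpClientTransport : HttpClientPipelineTransport
 {
     protected override void OnSendingRequest(
         PipelineMessage message,
         HttpRequestMessage httpRequest)
     {
         // Without this flag the browser buffers the whole response body.
         httpRequest.SetBrowserResponseStreamingEnabled(true);
     }
 }
 // Wire up at client construction (for example in Program.cs):
 var options = new OpenAIClientOptions();
 options.Transport = new BlazorHttpClientTransport();
 var chatClient = new ChatClient(model: "gpt-4o", apiKey, options);
 ```
 `SetBrowserResponseStreamingEnabled` is a browser-only switch, so this transport matters only if the SDK itself runs inside the WASM client, which Pitfall 2 rules out for this project. With the SDK on the backend, the backend-to-OpenAI leg streams without it. The flag must instead be set on the WASM client's own `HttpRequestMessage` when it calls the backend streaming endpoint, together with `HttpCompletionOption.ResponseHeadersRead`.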
 **Warning signs:**
 - Tokens appear all at once after a delay instead of progressively
 - No console errors — it looks like it is working but is not streaming
 - Local dev works "fine" because the full response still arrives, just not incrementally
 **Phase to address:**
 Phase that introduces streaming (SSE/token-by-token rendering). Must be addressed before any streaming demo is built.
 ---
 ### Pitfall 2: OpenAI API Key Exposed in Blazor WASM Client
 **What goes wrong:**
 A developer puts the OpenAI API key in `wwwroot/appsettings.json` or reads it from `IConfiguration` in a WASM component. Everything under `wwwroot/` is served as a static file, so any browser visitor can download the key by requesting `/appsettings.json` directly. The key is burned.
 **Why it happens:**
 Experienced C# developers coming from ASP.NET Core or console apps expect `appsettings.json` and `IConfiguration` to be server-side. In Blazor WASM, `wwwroot/appsettings.json` is a static file served to the browser and is not protected in any way. User Secrets also do not help: any configuration value the WASM client can read has, by definition, been delivered to the browser in plaintext.
 **How to avoid:**
 The OpenAI API key must live exclusively in the backend API (ASP.NET Core Minimal API or Web API project). The WASM client calls a backend endpoint (e.g., `/api/chat`) that holds the key and proxies requests to OpenAI. The key never appears in any client-side file or JS bundle. Use `dotnet user-secrets` on the server project only.
 **Warning signs:**
 - Any `IConfiguration["OpenAI:ApiKey"]` usage in a `.razor` file or service registered in the WASM `Program.cs`
 - `appsettings.json` in `wwwroot/` containing any token, key, or secret
 - The prefix `sk-` visible in browser DevTools → Network → response body for any `.json` file
 **Phase to address:**
 Phase 1 (project setup / architecture foundation). This must be locked in before any OpenAI code is written.
 ---
 ### Pitfall 3: Scoped DI Services Act as Singletons in WASM — State Leaks Across Conversations
 **What goes wrong:**
 A developer registers a `ConversationService` or `ChatStateService` as `Scoped`, expecting it to reset between logical "sessions" like it would in ASP.NET Core (per-request). In Blazor WASM there is exactly one DI scope for the lifetime of the browser tab. The service never resets. All conversations accumulate state in a single object, producing corrupted cross-conversation history.
 **Why it happens:**
 In ASP.NET Core, Scoped = per HTTP request. In Blazor WASM, Scoped = per application lifetime (equivalent to Singleton). There is no shorter-lived scope unless you use `OwningComponentBase`. Developers familiar with server-side DI expect different behavior.
 **How to avoid:**
 - Understand that in WASM, `Scoped` and `Singleton` are functionally identical
 - For services that manage per-conversation state, design them to hold a collection keyed by conversation ID rather than holding mutable "current conversation" state
 - If a service must be component-scoped, inherit from `OwningComponentBase` which creates a DI scope tied to the component's lifetime
 - Never store mutable "active session" state in a scoped/singleton service; store a dictionary of `ConversationId → ConversationState`
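 The dictionary-keyed design from the list above could look roughly like the following sketch. `ConversationStore`, `ConversationState`, and their members are hypothetical names chosen for illustration, not types defined by this project.
 ```csharp
 using System;
 using System.Collections.Generic;

 // Hypothetical per-conversation state; a real model would carry more fields.
 public sealed class ConversationState
 {
     public List<string> Messages { get; } = new();
     public bool IsLoading { get; set; }
 }

 // Has no "current conversation" field, so it behaves the same whether it is
 // registered as Scoped or Singleton, and state cannot leak between conversations.
 public sealed class ConversationStore
 {
     private readonly Dictionary<Guid, ConversationState> _states = new();

     // Returns the state for the given conversation, creating it on first use.
     public ConversationState GetOrCreate(Guid conversationId)
     {
         if (!_states.TryGetValue(conversationId, out var state))
         {
             state = new ConversationState();
             _states[conversationId] = state;
         }
         return state;
     }

     // Drops all in-memory state for a deleted conversation.
     public bool Remove(Guid conversationId) => _states.Remove(conversationId);
 }
 ```
 A component would then pass the selected conversation's ID on every call, for example `store.GetOrCreate(id).Messages.Add(text)`, instead of reading a shared `CurrentConversation` field.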
 **Warning signs:**
 - Switching conversations causes the wrong history to appear
 - Deleting a conversation does not fully clear its state from memory
 - Services have fields like `CurrentConversation` or `ActiveMessages` rather than `Dictionary<Guid, Conversation>`
 **Phase to address:**
 Phase introducing conversation state management and multi-conversation switching.
 ---
 ### Pitfall 4: `StateHasChanged` Not Called During Token Streaming — UI Freezes Until Completion
 **What goes wrong:**
 The developer wires up streaming correctly (transport fixed, backend proxying works) but the UI does not update token-by-token. The message bubble stays empty until all tokens arrive, then the entire response appears at once. This is indistinguishable from the streaming transport bug (Pitfall 1) if not diagnosed carefully.
 **Why it happens:**
 Blazor does not re-render after every `await` inside an `async` event handler or lifecycle method. For an event handler it renders when the handler first yields and again when the returned task completes, with nothing in between. When consuming an `IAsyncEnumerable<string>` of streaming tokens, the component must therefore call `StateHasChanged()` explicitly after appending each token (or each small batch of tokens); otherwise the intermediate states are never painted.
 **How to avoid:**
 ```csharp
 await foreach (var token in streamingResponse)
 {
    currentMessage += token;
    StateHasChanged(); // required — Blazor will not re-render otherwise
    await Task.Yield(); // prevents UI thread starvation on rapid token delivery
 }
 ```
 Additionally, consider throttling `StateHasChanged` calls (e.g., every 50ms or every N tokens) to avoid excessive rendering if token delivery is very fast.
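 One possible shape for that throttle is a small time-based gate, sketched below. `RenderThrottle` is a hypothetical helper, and the injectable clock exists only so the logic can be exercised without waiting in real time; the 50 ms interval is the figure suggested above, not a measured optimum.
 ```csharp
 using System;
 using System.Diagnostics;

 public sealed class RenderThrottle
 {
     private readonly TimeSpan _interval;
     private readonly Func<TimeSpan> _now;
     private TimeSpan? _lastRender;

     public RenderThrottle(TimeSpan interval, Func<TimeSpan>? now = null)
     {
         _interval = interval;
         var clock = Stopwatch.StartNew();
         _now = now ?? (() => clock.Elapsed);
     }

     // True at most once per interval; the first call always returns true.
     public bool ShouldRender()
     {
         var now = _now();
         if (_lastRender is { } last && now - last < _interval)
         {
             return false;
         }
         _lastRender = now;
         return true;
     }
 }

 // Inside the component's streaming loop (sketch):
 //   currentMessage += token;
 //   if (throttle.ShouldRender()) StateHasChanged();
 // Call StateHasChanged() once more after the loop so the final tokens appear.
 ```
 The trailing render after the loop matters: without it, tokens that arrive inside the last interval would stay invisible until some other event triggers a render.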
 **Warning signs:**
 - Streaming transport is confirmed working (via backend logs) but UI still shows nothing until complete
 - Token-by-token updates visible in server logs but not in the browser
 - Removing `await Task.Yield()` causes the browser tab to become unresponsive during streaming
 **Phase to address:**
 Streaming UI rendering phase. Document this explicitly inline in the component code.
 ---
 ### Pitfall 5: JSON File Storage Architecture Assumes Server Filesystem — Not Viable Pure Client-Side
 **What goes wrong:**
 A developer writes file I/O code (`File.ReadAllText`, `File.WriteAllText`) directly in the WASM project. The code compiles and appears to work during a session, but nothing reaches the host disk: Blazor WASM runs in a browser sandbox with no access to the host filesystem, and the virtual in-memory filesystem it writes to resets on every page refresh.
 **Why it happens:**
 C# file APIs exist in the WASM .NET runtime but map to an in-memory virtual filesystem, not the OS disk. Developers coming from console or desktop C# assume `File.WriteAllText("conversations.json", json)` writes to disk. It does not — the data vanishes on refresh.
 **How to avoid:**
 JSON file storage must live in the backend API (server-side), not in the WASM client. The correct architecture:
 - WASM client calls `POST /api/conversations` → backend writes JSON to disk
 - WASM client calls `GET /api/conversations` → backend reads JSON from disk and returns it
 - Backend stores files in a configurable local path (e.g., `~/chat-data/`)
 This reinforces the same architectural boundary required for API key protection (Pitfall 2).
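 On the backend side, the persistence behind those endpoints could be as small as the sketch below: one JSON file per conversation in a configurable directory. `JsonConversationStore`, `StoredConversation`, and `StoredMessage` are hypothetical names, and the sketch assumes it runs in the Web API project, never in the WASM client.
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.IO;
 using System.Text.Json;
 using System.Threading;
 using System.Threading.Tasks;

 public sealed record StoredMessage(string Role, string Content);
 public sealed record StoredConversation(Guid Id, string Title, List<StoredMessage> Messages);

 // Backend-only: reads and writes real files on the server host.
 public sealed class JsonConversationStore(string rootDirectory)
 {
     private static readonly JsonSerializerOptions Options = new() { WriteIndented = true };

     private string PathFor(Guid id) => Path.Combine(rootDirectory, $"{id}.json");

     public async Task SaveAsync(StoredConversation conversation, CancellationToken ct = default)
     {
         Directory.CreateDirectory(rootDirectory); // no-op if it already exists
         var json = JsonSerializer.Serialize(conversation, Options);
         await File.WriteAllTextAsync(PathFor(conversation.Id), json, ct);
     }

     // Returns null when no file exists for the given ID.
     public async Task<StoredConversation?> LoadAsync(Guid id, CancellationToken ct = default)
     {
         var path = PathFor(id);
         if (!File.Exists(path))
         {
             return null;
         }
         var json = await File.ReadAllTextAsync(path, ct);
         return JsonSerializer.Deserialize<StoredConversation>(json, Options);
     }
 }
 ```
 Reflection-based serialization is acceptable here because the backend is not trimmed the way the WASM client is; the same calls inside the client project would run into Pitfall 6.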
 **Warning signs:**
 - Any `System.IO.File` or `System.IO.Directory` usage inside the WASM project (`Client/`)
 - Conversations persist during a session but disappear on browser refresh
 - Data is present in the WASM virtual FS (`MemoryFileSystem`) but absent from the OS
 **Phase to address:**
 Phase 1 (architecture setup). The WASM/backend split must be established before any persistence code is written.
 ---
 ### Pitfall 6: IL Trimming Silently Breaks Code in Release Builds
 **What goes wrong:**
 The app works perfectly in `dotnet run` (Debug) but breaks in `dotnet publish` (Release). JSON serialization loses properties, services cannot be resolved, or features silently stop working. No exceptions in development, cryptic failures in production.
 **Why it happens:**
 Blazor WASM uses aggressive IL trimming (ILLink) during publish to reduce bundle size. The trimmer performs static analysis and removes types/methods that appear unreachable — including types used only via reflection (JSON serialization, DI, JSInterop callbacks). Debug builds do not trim.
 **How to avoid:**
 - Use `[JsonSerializable]` with `System.Text.Json` source generation for all DTO types
 - Apply `[DynamicDependency]` to methods called via reflection
 - Apply `[JSInvokable]` to all methods callable from JavaScript
 - Run `dotnet publish` early in the project (Phase 1 or 2) to detect trim warnings while the surface is small
 - Treat `<TrimmerRootDescriptor>` as a last resort, not a first step
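 The first bullet above can be illustrated with a minimal source-generation context. `ChatMessageDto`, `ChatJsonContext`, and `ChatJson` are hypothetical names; the sketch assumes the DTOs live in a project shared by client and backend.
 ```csharp
 using System.Collections.Generic;
 using System.Text.Json;
 using System.Text.Json.Serialization;

 // Hypothetical DTO exchanged between the WASM client and the backend.
 public sealed record ChatMessageDto(string Role, string Content);

 // The source generator emits serialization code for each listed type at
 // compile time, so nothing here depends on reflection the trimmer could remove.
 [JsonSerializable(typeof(ChatMessageDto))]
 [JsonSerializable(typeof(List<ChatMessageDto>))]
 public partial class ChatJsonContext : JsonSerializerContext
 {
 }

 public static class ChatJson
 {
     public static string Serialize(List<ChatMessageDto> messages) =>
         JsonSerializer.Serialize(messages, ChatJsonContext.Default.ListChatMessageDto);

     public static List<ChatMessageDto>? Deserialize(string json) =>
         JsonSerializer.Deserialize(json, ChatJsonContext.Default.ListChatMessageDto);
 }
 ```
 The same `JsonTypeInfo` values can be passed to the `HttpClient` JSON helpers, for example `GetFromJsonAsync(url, ChatJsonContext.Default.ListChatMessageDto)`, so HTTP calls in the client stay trim-safe as well.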
 **Warning signs:**
 - App works in `dotnet run` but throws `NullReferenceException` or loses data after `dotnet publish`
 - JSON responses missing properties that were present in debug
 - IL trimmer warnings during publish that were ignored
 **Phase to address:**
 Phase 1 (publish pipeline verification), and again in the phase that introduces the JSON data models for conversations.
 ---
 ## Technical Debt Patterns
 | Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
 |----------|-------------------|----------------|-----------------|
 | Put API key in WASM `appsettings.json` | Faster to get OpenAI call working | Key is permanently burned; must rotate | Never |
 | Call OpenAI directly from WASM HttpClient | Eliminates backend project | API key exposed; no server-side rate limiting; blocks v2 RAG/LangChain which needs server | Never |
 | Write file I/O in WASM project | Familiar C# patterns | Silent data loss on refresh; hard to migrate later | Never |
 | All logic in `.razor` files | Faster iteration in early phases | Untestable; components become unmaintainable; hard to add v2 agent layer | Phase 1 only, refactor before Phase 3 |
 | Single `ChatService` singleton holding all state | Simple to start | State leaks across conversations; breaks multi-conversation feature | Never in this project — multi-conversation is a core requirement |
 | Skip `StateHasChanged` calls during streaming | Code is simpler | UI appears broken; streaming appears non-functional | Never |
 | Skip IL trim testing until "done" | Saves time during early phases | Trim bugs compound as codebase grows | Acceptable in Phase 1-2 if `dotnet publish` test is added to Phase 3 |
 ---
 ## Integration Gotchas
 | Integration | Common Mistake | Correct Approach |
 |-------------|----------------|-----------------|
 | OpenAI .NET SDK + Blazor WASM | Using SDK directly in WASM project without custom transport: streaming silently broken | Keep the SDK on the backend only; if it ever runs in WASM, use a custom `BlazorHttpClientTransport` that calls `SetBrowserResponseStreamingEnabled(true)` |
 | OpenAI streaming via backend proxy | Backend returns full `StreamingChatCompletionUpdate` objects — client gets batched response | Backend uses `IAsyncEnumerable` with `[EnumeratorCancellation]`; streams SSE or NDJSON to client |
 | CORS between WASM client and backend API | Forgetting `AddCors` + `UseCors` on backend; 403/CORS errors during local dev | Configure CORS policy explicitly in backend `Program.cs`; scope to localhost dev origins |
 | JSON serialization of conversation models | Properties stripped by trimmer in Release; conversation history loses fields | Use `System.Text.Json` source generators with `[JsonSerializable]` for all model types |
 | Markdown rendering (AI responses) | Using `MarkupString` with raw AI output: XSS risk if AI returns script tags | Render with `Markdig` using a pipeline built with `DisableHtml()`, or pass the generated HTML through an HTML sanitizer before wrapping it in `MarkupString` |
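 For the CORS row in the table above, a backend `Program.cs` could be set up along these lines. The policy name `DevClient` and the origin `https://localhost:5001` are placeholders; the real origin is whatever address the WASM dev server uses.
 ```csharp
 using Microsoft.AspNetCore.Builder;
 using Microsoft.Extensions.DependencyInjection;

 var builder = WebApplication.CreateBuilder(args);

 // Allow only the local WASM client origin (placeholder URL) instead of any origin.
 builder.Services.AddCors(options => options.AddPolicy("DevClient", policy => policy
     .WithOrigins("https://localhost:5001")
     .AllowAnyHeader()
     .AllowAnyMethod()));

 var app = builder.Build();

 // Must run before the endpoints that the browser calls cross-origin.
 app.UseCors("DevClient");

 app.Run();
 ```
 Naming the origin explicitly, rather than using `AllowAnyOrigin`, also covers the CORS wildcard concern raised under Security Mistakes.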
 ---
 ## Performance Traps
 | Trap | Symptoms | Prevention | When It Breaks |
 |------|----------|------------|----------------|
 | Calling `StateHasChanged` on every token without throttling | Browser tab becomes unresponsive during fast streaming; CPU spikes | Throttle to every 50ms or every N tokens using a timer or counter | At ~20+ tokens/second (typical GPT-4o speed) |
 | Re-rendering entire conversation list on every message append | Visible flicker; full list DOM re-created on each token | Use `@key` directive on conversation list items; isolate streaming component from sidebar | At 5+ conversations in the list |
 | Loading all conversation history on app start | Slow initial load for users with many old conversations | Lazy-load conversation content; sidebar shows metadata only; load full messages on selection | At 50+ conversations stored in JSON |
 | Excessive component nesting for chat messages | Sluggish scroll performance with many messages | Keep message list in a single component with virtualization (`Virtualize`) for long histories | At 200+ messages in a single conversation |
 ---
 ## Security Mistakes
 | Mistake | Risk | Prevention |
 |---------|------|------------|
 | OpenAI API key in `wwwroot/appsettings.json` | Key visible to any browser user; unlimited API charges | Key lives only in backend server project; accessed via `dotnet user-secrets` or environment variable |
 | Calling OpenAI API directly from WASM (no backend) | Same as above; also bypasses rate limiting and logging | Mandatory backend proxy — this is enforced by the architecture from Phase 1 |
 | Rendering AI response HTML without sanitization | AI model could produce `<script>` tags; XSS attack on self | Use Markdown-to-HTML library with HTML sanitization; never use raw `MarkupString` on LLM output |
 | Storing secrets in WASM `IConfiguration` | Even runtime config values in WASM are readable from browser | All secrets in server-side `IConfiguration` only; WASM config contains only public values (API endpoint URLs) |
 | CORS wildcard (`AllowAnyOrigin`) in backend | Allows any site to call the local chat backend | Restrict CORS to `localhost` origins during development; lock down on deploy |
 ---
 ## UX Pitfalls
 | Pitfall | User Impact | Better Approach |
 |---------|-------------|-----------------|
 | No loading indicator while waiting for first streaming token | App appears frozen; user thinks it broke | Show typing indicator / spinner immediately on message send; hide when first token arrives |
 | No way to cancel an in-progress stream | User must wait for full response even if they sent the wrong message | Pass `CancellationToken` through to streaming call; expose cancel button that calls `cts.Cancel()` |
 | Auto-scroll that fights user scroll position | User scrolls up to read history; app violently scrolls back to bottom on each token | Only auto-scroll if the user is already at the bottom; detect scroll position before each `StateHasChanged` |
 | No error message when OpenAI API call fails | Empty response bubble; user does not know what happened | `try/catch` around all API calls; display inline error in message bubble with retry option |
 | Conversation list has no visual indication of active conversation | User loses track of which conversation is displayed | Highlight active conversation item; update `document.title` with conversation name |
 | Markdown rendered as raw text | AI responses with code blocks and lists look like symbol-laden garbage | Wire up Markdown renderer before any AI content is displayed — do not defer this |
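 The cancel-button row in the table above comes down to owning one `CancellationTokenSource` per in-flight stream. A minimal sketch, with `StreamSession` as a hypothetical helper name:
 ```csharp
 using System;
 using System.Threading;

 public sealed class StreamSession : IDisposable
 {
     private CancellationTokenSource? _cts;

     // Starts a new stream and cancels any previous one that is still running.
     public CancellationToken Begin()
     {
         Cancel();
         _cts?.Dispose();
         _cts = new CancellationTokenSource();
         return _cts.Token;
     }

     // Bound to the cancel button; safe to call when nothing is streaming.
     public void Cancel() => _cts?.Cancel();

     // Call from the component's Dispose so navigating away stops the stream.
     public void Dispose()
     {
         Cancel();
         _cts?.Dispose();
         _cts = null;
     }
 }
 ```
 The token returned by `Begin()` is the one to pass through to the streaming call; the loop consuming the stream should expect an `OperationCanceledException` and treat it as a normal outcome rather than an error.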
 ---
 ## "Looks Done But Isn't" Checklist
 - [ ] **Streaming:** Tokens appear progressively in the UI — verify this is actual token-by-token delivery, not a batched response that arrives quickly. Check with a slow prompt.
 - [ ] **API key security:** Open DevTools → Network → find any `.json` request → confirm no `sk-` prefixed values appear in any response body.
 - [ ] **Conversation persistence:** Close and reopen the browser tab (not just refresh). Confirm conversations are still present.
 - [ ] **Multi-conversation isolation:** Open conversation A, send a message. Switch to conversation B. Verify conversation A's messages do not bleed into B.
 - [ ] **Stream cancellation:** Start a long generation. Click cancel or navigate away. Confirm the backend stops consuming tokens (check backend logs).
 - [ ] **Release build:** Run `dotnet publish` at least once before calling any phase "done." Confirm the published app loads and all features work.
 - [ ] **Error handling:** Temporarily set an invalid API key. Confirm the UI shows a user-friendly error rather than a blank component or silent failure.
 - [ ] **Markdown rendering:** Ask the AI for a code snippet. Confirm it renders in a code block, not as raw backtick-surrounded text.
 ---
 ## Recovery Strategies
 | Pitfall | Recovery Cost | Recovery Steps |
 |---------|---------------|----------------|
 | API key exposed in WASM | HIGH | Immediately rotate key in OpenAI dashboard; add `Client/` project scan to CI to block any future secret patterns |
 | All logic in `.razor` files (discovered late) | MEDIUM | Extract services incrementally — one component per PR; do not refactor all at once |
 | File I/O written in WASM project | MEDIUM | Move all persistence calls to backend API; update WASM to use `HttpClient` calls to new endpoints |
 | Streaming not working (transport not set) | LOW | Add `BlazorHttpClientTransport` wrapper — 10-line fix once identified; the hard part is diagnosing it |
 | Scoped service holding mutable conversation state | MEDIUM | Redesign service to hold `Dictionary<Guid, ConversationState>`; update all call sites |
 | IL trim breaks release build | MEDIUM–HIGH (depends on when discovered) | Add source generators for all model types; treat every trim warning as a compile error |
 ---
 ## Pitfall-to-Phase Mapping
 | Pitfall | Prevention Phase | Verification |
 |---------|------------------|--------------|
 | API key in WASM client | Phase 1: Project setup and architecture | Grep WASM project for `sk-`; confirm key only in backend `user-secrets` |
 | File I/O in WASM project | Phase 1: Project setup and architecture | Grep WASM project for `System.IO.File`; confirm persistence is backend-only |
 | Direct OpenAI call from WASM | Phase 1: Project setup and architecture | Confirm no OpenAI SDK registration in WASM `Program.cs` |
 | CORS misconfiguration | Phase 1: First HTTP call from WASM to backend | Verify browser console shows no CORS errors during local dev |
 | Scoped DI lifetime confusion | Phase covering conversation state management | Test: create two conversations; switch between them; verify history isolation |
 | Streaming transport not set | Phase introducing token-by-token streaming | Observe: tokens must appear incrementally; verify with slow prompt or network throttle |
 | StateHasChanged missing in stream loop | Phase introducing token-by-token streaming | Same as above — visible as UI not updating mid-stream |
 | UI freeze without streaming throttle | Phase polishing streaming UI | CPU profiler during active stream; verify no jank |
 | IL trimming breaks release | Phase 1 (publish test) + any phase adding new model types | Run `dotnet publish` as part of each phase completion check |
 | Markdown XSS via raw MarkupString | Phase introducing Markdown rendering | Code review: confirm no `new MarkupString(aiResponse)` without prior sanitization |
 | Auto-scroll fighting user scroll | Phase building chat message UI | Manual test: scroll up mid-stream; verify auto-scroll does not override |
 | No cancel button | Phase building streaming UI | Test: start stream, navigate away; check backend logs confirm stream terminated |
 ---
 ## Sources
 - [openai/openai-dotnet Issue #65 — Streaming doesn't work properly in Blazor WASM](https://github.com/openai/openai-dotnet/issues/65) — confirmed fix with `BlazorHttpClientTransport`
 - [Microsoft Q&A — Streaming Issue with Blazor WebAssembly and Semantic Kernel and OpenAI](https://learn.microsoft.com/en-sg/answers/questions/2242618/streaming-issue-with-blazor-webassembly-and-semati)
 - [DEV Community — Real Blazor WebAssembly Production Pitfalls](https://dev.to/janhjordie/real-blazor-webassembly-production-pitfalls-3hmf) — IL trimming, JS interop, release-only failures
 - [Chandradev Blog — 10 Blazor Coding Mistakes](https://chandradev819.wordpress.com/2025/12/17/10-blazor-coding-mistakes-i-see-in-real-projects-and-how-to-avoid-them/) — logic in components, DI misuse, naming
 - [Thinktecture — Dependency Injection Scopes in Blazor](https://www.thinktecture.com/en/blazor/dependency-injection-scopes-in-blazor/) — Scoped = Singleton in WASM
 - [ASP.NET Core Blazor DI docs](https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/dependency-injection) — official lifetime guidance
 - [ASP.NET Core Blazor rendering performance best practices](https://learn.microsoft.com/en-us/aspnet/core/blazor/performance/rendering) — StateHasChanged, re-render control
 - [dotnet/aspnetcore Issue #43098 — StateHasChanged not firing with IAsyncEnumerable](https://github.com/dotnet/aspnetcore/issues/43098)
 - [Microsoft — Secure ASP.NET Core Blazor WebAssembly](https://learn.microsoft.com/en-us/aspnet/core/blazor/security/webassembly/) — API key security, no secrets in WASM
 - [DEV Community — The Missing Third Config Layer: User Secrets in Blazor WASM](https://dev.to/j_sakamoto/the-missing-third-config-layer-adding-user-secrets-to-blazor-webassembly-2a5a) — confirms user secrets are NOT secret in WASM
 - [Microsoft — Blazor WebAssembly file access Q&A](https://learn.microsoft.com/en-us/answers/questions/1290337/blazor-webassembly-get-file-access) — browser sandbox, no local disk
 - [tpeczek.com — ASP.NET Core 9 and IAsyncEnumerable — Async Streaming from Blazor WASM](https://www.tpeczek.com/2024/09/aspnet-core-9-and-iasyncenumerable.html) — correct streaming patterns
 - [dotnet/aspnetcore Issue #55982 — network error with IAsyncEnumerable streaming in WASM](https://github.com/dotnet/aspnetcore/issues/55982)
 ---
 *Pitfalls research for: Blazor WebAssembly AI Chat Application*
 *Researched: 2026-03-27*
--- a/.planning/research/STACK.md
+++ b/.planning/research/STACK.md
@@ -0,0 +1,180 @@
 # Stack Research
 **Domain:** Blazor WebAssembly AI Chat Application (.NET / C#)
 **Researched:** 2026-03-27
 **Confidence:** HIGH (core stack verified via NuGet and official Microsoft docs; version numbers confirmed via nuget.org)
 ---
 ## Recommended Stack
 ### Core Technologies
 | Technology | Version | Purpose | Why Recommended |
 |------------|---------|---------|-----------------|
 | .NET 9 SDK | 9.x (latest patch) | Runtime, tooling, SDK | Stable STS release; .NET 10 is already generally available as the LTS release, so pin 9 for the tutorial foundation and treat 10 as a later upgrade |
 | Blazor WebAssembly Standalone | .NET 9 | Client SPA running in-browser | Non-negotiable per project constraints; client-side execution with no server round-trip for UI |
 | ASP.NET Core Web API | .NET 9 | Backend proxy for OpenAI calls | Required to keep the OpenAI API key server-side; WASM cannot access secrets directly |
 | C# 13 | Included with .NET 9 | Application language | Included in .NET 9 SDK; no separate install needed |
 **Critical architecture note:** The "hosted Blazor WebAssembly" template (single `.sln` with Client + Server + Shared projects) was removed in .NET 8. In .NET 9, you create two separate projects manually: a `dotnet new blazorwasm` standalone client and a `dotnet new webapi` backend, then add them to a solution. This is the correct approach for this project.
 ### OpenAI Integration
 | Library | Version | Purpose | Why Recommended |
 |---------|---------|---------|-----------------|
 | `OpenAI` (official) | 2.9.1 | OpenAI API client with streaming | The official OpenAI-published .NET library; supports `CompleteChatStreamingAsync()` returning `AsyncCollectionResult<StreamingChatCompletionUpdate>` via `await foreach`; stable release as of 2026-03-02 |
 **Do not use** `OpenAI-DotNet` (version 8.8.8) — this is an unofficial community package with a different API surface. The official `OpenAI` package is published directly by OpenAI and is the correct choice.
 **Streaming mechanism:** The backend Web API endpoint calls `CompleteChatStreamingAsync()` and proxies chunks to the client. The WASM client uses `HttpCompletionOption.ResponseHeadersRead` with `SetBrowserResponseStreamingEnabled(true)` on the `HttpRequestMessage` to consume the streamed response. In .NET 10 streaming is enabled by default; in .NET 9 it must be explicitly opted in per-request.
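 The client half of that mechanism might look like the sketch below. It assumes the backend emits Server-Sent Events with one `data:` line per chunk; `ChatStreamClient` and the endpoint URL passed to it are hypothetical. `SetBrowserResponseStreamingEnabled` comes from the `Microsoft.AspNetCore.Components.WebAssembly.Http` namespace, which the standalone Blazor WASM template already references.
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.IO;
 using System.Net.Http;
 using System.Runtime.CompilerServices;
 using System.Threading;
 using Microsoft.AspNetCore.Components.WebAssembly.Http;

 public sealed class ChatStreamClient(HttpClient http)
 {
     public async IAsyncEnumerable<string> StreamAsync(
         string url,
         [EnumeratorCancellation] CancellationToken ct = default)
     {
         using var request = new HttpRequestMessage(HttpMethod.Get, url);
         // Opt in per request; required on .NET 9.
         request.SetBrowserResponseStreamingEnabled(true);

         // ResponseHeadersRead returns before the body has been buffered.
         using var response = await http.SendAsync(
             request, HttpCompletionOption.ResponseHeadersRead, ct);
         response.EnsureSuccessStatusCode();

         await using var stream = await response.Content.ReadAsStreamAsync(ct);
         using var reader = new StreamReader(stream);

         while (await reader.ReadLineAsync(ct) is { } line)
         {
             if (ParseSseData(line) is { } data)
             {
                 yield return data;
             }
         }
     }

     // Returns the payload of an SSE "data:" line, or null for any other line.
     public static string? ParseSseData(string line)
     {
         if (!line.StartsWith("data:", StringComparison.Ordinal))
         {
             return null;
         }
         var value = line["data:".Length..];
         return value.StartsWith(' ') ? value[1..] : value;
     }
 }
 ```
 This line-by-line parser is deliberately simple: it ignores multi-line `data:` events and assumes the backend encodes chunks so that they never contain raw newlines.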
 ### Markdown Rendering
 | Library | Version | Purpose | Why Recommended |
 |---------|---------|---------|-----------------|
 | `Markdig` | 1.1.1 | Parse markdown text to HTML | The de facto standard markdown processor for .NET; CommonMark-compliant, fast, extensible, targets .NET Standard 2.0 so works in WASM; used by Microsoft and Syncfusion as the underlying engine |
 **How it integrates in Blazor:** Call `Markdig.Markdown.ToHtml(content)` on the client, render the result with `@((MarkupString)htmlContent)` in a Razor component. No JS interop needed.
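 Because model output is untrusted, a slightly more defensive variant of that call is sketched here. It builds a Markdig pipeline with `DisableHtml()`, which makes Markdig escape raw HTML in the input instead of passing it through. `SafeMarkdown` is a hypothetical helper name, and this reduces rather than eliminates XSS risk; a dedicated HTML sanitizer is the stronger defense.
 ```csharp
 using Markdig;

 public static class SafeMarkdown
 {
     // Built once and reused; a built MarkdownPipeline is immutable.
     private static readonly MarkdownPipeline Pipeline = new MarkdownPipelineBuilder()
         .DisableHtml() // raw HTML such as <script> is emitted as escaped text
         .Build();

     public static string ToHtml(string markdown) =>
         Markdown.ToHtml(markdown, Pipeline);
 }
 ```
 In a Razor component the result would be rendered with `@((MarkupString)SafeMarkdown.ToHtml(message.Content))`, where `message.Content` stands in for whatever property holds the AI response text.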
 ### UI Component Library
 | Library | Version | Purpose | Why Recommended |
 |---------|---------|---------|-----------------|
 | `MudBlazor` | 9.2.0 | Material Design component library | Full .NET 9 support confirmed; pure C# with minimal JavaScript; comprehensive chat-friendly components (MudTextField, MudPaper, MudScrollToBottom, MudList); large community; no per-seat licensing |
 **Alternatives considered:** Radzen Blazor (free, good) and Telerik UI for Blazor (licensed). MudBlazor wins for a tutorial/personal project because it is free, is written almost entirely in C# with very little JavaScript, and has excellent documentation for learners.
 ### JSON Storage (Server-side)
 | Technology | Version | Purpose | Why Recommended |
 |------------|---------|---------|-----------------|
 | `System.Text.Json` | Built into .NET 9 | Serialize/deserialize conversation history | Built-in, no extra dependency; `JsonSerializerOptions` with `WriteIndented = true` for human-readable files; async file I/O via `File.ReadAllTextAsync` / `File.WriteAllTextAsync` |
 Storage lives entirely on the **backend** (Web API project). The WASM client cannot access the local filesystem — only the server can. API endpoints expose CRUD operations over conversations, with JSON files persisted in a configurable directory on the server host.
 ---
 ## Supporting Libraries
 | Library | Version | Purpose | When to Use |
 |---------|---------|---------|-------------|
 | `Microsoft.Extensions.AI` (abstractions) | 9.x preview | Optional AI abstraction layer | Skip for v1 — adds indirection before the core chat pattern is understood. Relevant for v2 when adding multi-provider support |
 | `Blazored.LocalStorage` | latest | Browser local storage | Not needed for this project — persistence is on the server via JSON files, not the browser |
 | `System.Net.ServerSentEvents` | Built into .NET 9 | SSE parser for streaming | Used automatically by the `OpenAI` library on the server; no direct usage needed |
 ---
 ## Development Tools
 | Tool | Purpose | Notes |
 |------|---------|-------|
 | Visual Studio 2022 (v17.12+) | IDE with Blazor hot reload | Recommended for tutorial builder; full Blazor debugging, component preview, and hot reload support |
 | VS Code + C# Dev Kit | Lighter-weight alternative | Works well; use `dotnet watch` for hot reload |
 | `dotnet watch run` | Hot reload during development | Run it in both the `ChatAgentApp.Client` and `ChatAgentApp.Api` project directories at the same time |
 | `dotnet-dev-certs` | HTTPS dev certificate | Required for local HTTPS; run `dotnet dev-certs https --trust` once |
 ---
 ## Installation
 ```bash
 # Create solution
 mkdir ChatAgentApp && cd ChatAgentApp
 dotnet new sln -n ChatAgentApp
 # Create Blazor WASM client (standalone)
 dotnet new blazorwasm -n ChatAgentApp.Client --framework net9.0
 dotnet sln add ChatAgentApp.Client/ChatAgentApp.Client.csproj
 # Create ASP.NET Core Web API backend
 dotnet new webapi -n ChatAgentApp.Api --framework net9.0
 dotnet sln add ChatAgentApp.Api/ChatAgentApp.Api.csproj
 # Install OpenAI SDK in the API project
 cd ChatAgentApp.Api
 dotnet add package OpenAI --version 2.9.1
 # Install Markdig and MudBlazor in the Client project
 cd ../ChatAgentApp.Client
 dotnet add package Markdig --version 1.1.1
 dotnet add package MudBlazor --version 9.2.0
 ```
 ---
 ## Alternatives Considered
 | Recommended | Alternative | When to Use Alternative |
 |-------------|-------------|-------------------------|
 | `OpenAI` 2.9.1 (official) | `OpenAI-DotNet` 8.8.8 (unofficial) | Never — the official package is now stable and maintained by OpenAI directly |
 | `OpenAI` 2.9.1 (official) | `Azure.AI.OpenAI` 2.1.0 | When targeting Azure OpenAI Service specifically (e.g., enterprise, EU data residency, private endpoints) — overkill for this project |
 | `Markdig` | `CommonMark.NET` | Only if strict CommonMark compliance matters more than extensions; Markdig is a superset and the ecosystem standard |
 | `MudBlazor` | Radzen Blazor | Radzen is fine; choose it if you already know it; MudBlazor has more learning resources |
 | `MudBlazor` | Telerik UI for Blazor | Telerik requires a paid license; not appropriate for a personal tool |
 | Standalone WASM + separate Web API | Blazor Web App template (unified) | Use the unified Blazor Web App template when you want mixed Server+WASM render modes in a single project; overkill for this project and obscures the WASM-specific patterns the tutorial aims to teach |
 | JSON flat files (server-side) | SQLite via EF Core | SQLite is a better choice at scale; JSON is simpler for single-user personal tools and avoids introducing a migration workflow |
 ---
 ## What NOT to Use
 | Avoid | Why | Use Instead |
 |-------|-----|-------------|
 | `OpenAI-DotNet` (unofficial) | Different API surface, not maintained by OpenAI, version numbers create confusion | Official `OpenAI` NuGet package |
 | `Microsoft.SemanticKernel` | Adds significant abstraction and dependency weight for a tutorial; streaming works but is complex to explain | Direct `OpenAI` SDK calls; add SK in v2 when orchestration is needed |
 | JavaScript `EventSource` API via JSInterop for streaming | Blazor WASM has `SetBrowserResponseStreamingEnabled` which avoids JS interop; adding JSInterop for streaming increases complexity significantly | `HttpCompletionOption.ResponseHeadersRead` + `SetBrowserResponseStreamingEnabled(true)` in the HTTP handler |
 | `Newtonsoft.Json` | Unnecessary dependency; `System.Text.Json` is built into .NET 9 and is generally faster; Newtonsoft was the default serializer before .NET Core 3.0 | `System.Text.Json` (built-in) |
 | `Blazored.LocalStorage` for persistence | Browser storage is limited (roughly 5 MB per origin), can be cleared by the user at any time, and is not suitable for chat history of any meaningful length; it also keeps all data client-side | Server-side JSON file storage via the Web API |
 | AOT compilation during learning phase | Dramatically increases build times; not needed until production optimization is a concern; confusing to introduce in a tutorial | Default IL interpretation; add AOT opt-in note in the final phase |
 ---
 ## Stack Patterns by Variant
 **For streaming responses from the API backend to the WASM client:**
 - Backend streams OpenAI tokens as `text/event-stream` (SSE) or `application/x-ndjson`
 - Client uses `SetBrowserResponseStreamingEnabled(true)` on `HttpRequestMessage`
 - Client reads with `HttpCompletionOption.ResponseHeadersRead` and iterates the stream
 - Trigger `StateHasChanged()` in the component after each token to update the UI
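 The client-side steps above can be sketched roughly as follows. This is a minimal sketch rather than the project's actual `ChatApiClient`: the `StreamLinesAsync` name and the `url` parameter are hypothetical, and it assumes the backend writes one token or event per text line (NDJSON, or SSE `data:` lines), leaving the parsing of each line to the caller.
 ```csharp
 using System.Collections.Generic;
 using System.IO;
 using System.Net.Http;
 using System.Runtime.CompilerServices;
 using System.Threading;
 using Microsoft.AspNetCore.Components.WebAssembly.Http;
 // Hypothetical sketch of the WASM-side stream reader.
 public sealed class ChatApiClient
 {
     private readonly HttpClient _http;
     public ChatApiClient(HttpClient http) => _http = http;
     // Yields each non-empty line of the response body as it arrives.
     public async IAsyncEnumerable<string> StreamLinesAsync(
         string url, [EnumeratorCancellation] CancellationToken ct = default)
     {
         using var request = new HttpRequestMessage(HttpMethod.Get, url);
         // Opt in to browser response streaming (required in .NET 9 WASM).
         request.SetBrowserResponseStreamingEnabled(true);
         // Return as soon as headers arrive instead of buffering the body.
         using var response = await _http.SendAsync(
             request, HttpCompletionOption.ResponseHeadersRead, ct);
         response.EnsureSuccessStatusCode();
         await using var body = await response.Content.ReadAsStreamAsync(ct);
         using var reader = new StreamReader(body);
         while (await reader.ReadLineAsync(ct) is { } line)
         {
             if (line.Length > 0)
             {
                 yield return line;
             }
         }
     }
 }
 ```
 A component would consume this with `await foreach (var line in client.StreamLinesAsync(url))`, append the text to the in-progress message, and then call `StateHasChanged()`.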
 **For local JSON file storage on the server:**
 - Define a `ConversationRepository` service on the API that reads/writes from a configurable base path
 - Register as `Singleton` (not `Scoped`): a scoped registration would create a new instance, and therefore a new lock, for every HTTP request, so file access would no longer be serialized
 - Use `SemaphoreSlim(1,1)` to prevent concurrent write conflicts even in single-user mode
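 A minimal sketch of such a repository is shown below. The `Conversation` and `ChatMessage` record shapes, the one-file-per-conversation layout, and the method names are assumptions made for illustration; only the `SemaphoreSlim(1,1)` gate, `System.Text.Json`, and the async file APIs come from the notes above.
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.IO;
 using System.Text.Json;
 using System.Threading;
 using System.Threading.Tasks;
 // Hypothetical model shapes; the real ones would live in the Shared project.
 public sealed record ChatMessage(string Role, string Content);
 public sealed record Conversation(Guid Id, string Title, List<ChatMessage> Messages);
 public sealed class ConversationRepository
 {
     private static readonly JsonSerializerOptions Options = new() { WriteIndented = true };
     // One permit: only one read or write touches the files at a time.
     private readonly SemaphoreSlim _gate = new(1, 1);
     private readonly string _basePath;
     public ConversationRepository(string basePath)
     {
         _basePath = basePath;
         Directory.CreateDirectory(_basePath); // no-op if it already exists
     }
     // Assumed layout: one JSON file per conversation, named by its ID.
     private string PathFor(Guid id) => Path.Combine(_basePath, $"{id}.json");
     public async Task SaveAsync(Conversation conversation, CancellationToken ct = default)
     {
         await _gate.WaitAsync(ct);
         try
         {
             string json = JsonSerializer.Serialize(conversation, Options);
             await File.WriteAllTextAsync(PathFor(conversation.Id), json, ct);
         }
         finally
         {
             _gate.Release();
         }
     }
     public async Task<Conversation?> LoadAsync(Guid id, CancellationToken ct = default)
     {
         await _gate.WaitAsync(ct);
         try
         {
             string path = PathFor(id);
             if (!File.Exists(path)) return null;
             string json = await File.ReadAllTextAsync(path, ct);
             return JsonSerializer.Deserialize<Conversation>(json, Options);
         }
         finally
         {
             _gate.Release();
         }
     }
 }
 ```
 Registration would then be a single line in the API's `Program.cs`, for example `builder.Services.AddSingleton(new ConversationRepository(dataPath))`, where `dataPath` is read from configuration.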
 **For markdown rendering in the client:**
 - Use `Markdig.Markdown.ToHtml(text, pipeline)` where `pipeline` is built with `MarkdownPipelineBuilder` enabling extensions (e.g., `UseAutoLinks()`, `UseEmojiAndSmiley()`)
 - Render the HTML string using `@((MarkupString)html)` inside a `<div class="markdown-body">` element
 - Apply CSS (GitHub Markdown CSS or custom) scoped to `.markdown-body` for code blocks and tables
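 The rendering step above could be wrapped in a small helper like the one below. The `MarkdownRenderer` class name is hypothetical; the Markdig calls are the ones named in the list. The pipeline is built once and reused, since the render call runs every time a message is displayed.
 ```csharp
 using Markdig;
 // Hypothetical helper; builds the pipeline once and reuses it for every message.
 public static class MarkdownRenderer
 {
     private static readonly MarkdownPipeline Pipeline = new MarkdownPipelineBuilder()
         .UseAutoLinks()
         .UseEmojiAndSmiley()
         .Build();
     // Converts markdown text to an HTML string; null input renders as empty.
     public static string ToHtml(string? markdown) =>
         Markdown.ToHtml(markdown ?? string.Empty, Pipeline);
 }
 ```
 In a component this would be used as `@((MarkupString)MarkdownRenderer.ToHtml(Message.Content))` inside the `.markdown-body` element, where `Message` is an assumed parameter name.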
 ---
 ## Version Compatibility
 | Package | Compatible With | Notes |
 |---------|-----------------|-------|
 | `OpenAI` 2.9.1 | .NET Standard 2.0+ (.NET 9 confirmed) | Published 2026-03-02; requires `System.Net.ServerSentEvents` (built into .NET 9) |
 | `Markdig` 1.1.1 | .NET 8.0, .NET Standard 2.0, .NET Framework 4.6.2 | .NET 9 compatible via .NET 8 TFM; published 2026-03-04 |
 | `MudBlazor` 9.2.0 | .NET 8.0, .NET 9.0, .NET 10.0 | Published 2026-03-18; version 9.x = full support for .NET 9 |
 | .NET 9 SDK | Blazor WASM + Web API in same solution | Both project types target `net9.0`; no cross-framework issues |
 ---
 ## Sources
 - https://www.nuget.org/packages/OpenAI — Official OpenAI NuGet package; version 2.9.1 confirmed (2026-03-02)
 - https://github.com/openai/openai-dotnet — Official OpenAI .NET SDK; streaming API verified (`CompleteChatStreamingAsync`, `await foreach`)
 - https://www.nuget.org/packages/Markdig — Markdig version 1.1.1 confirmed (2026-03-04)
 - https://www.nuget.org/packages/MudBlazor — MudBlazor 9.2.0 confirmed; .NET 8/9/10 full support (2026-03-18)
 - https://learn.microsoft.com/en-us/aspnet/core/blazor/hosting-models?view=aspnetcore-9.0 — Official Blazor hosting model docs; standalone WASM vs Blazor Web App distinction verified
 - https://learn.microsoft.com/en-us/dotnet/core/compatibility/networking/10.0/default-http-streaming — Breaking change: WASM streaming opt-in (.NET 9) vs default (.NET 10)
 - https://www.strathweb.com/2024/07/built-in-support-for-server-sent-events-in-net-9/ — SSE native support in .NET 9 via `System.Net.ServerSentEvents`; used internally by OpenAI SDK (MEDIUM confidence, single source)
 - https://github.com/openai/openai-dotnet/issues/65 — Confirmed streaming issue in Blazor WASM requires `SetBrowserResponseStreamingEnabled(true)` (MEDIUM confidence, GitHub issue thread)
 - https://devblogs.microsoft.com/dotnet/openai-dotnet-library/ — Official .NET Blog announcement of the OpenAI library
 - https://dev.to/kazinix/blazor-web-app-webassembly-hosted-in-net8-and-net9-1k6g — Hosted template removal in .NET 8+, manual solution structure (MEDIUM confidence)
 ---
 *Stack research for: Blazor WebAssembly AI Chat Application*
 *Researched: 2026-03-27*
--- a/.planning/research/SUMMARY.md
+++ b/.planning/research/SUMMARY.md
@@ -0,0 +1,195 @@
 # Project Research Summary
 **Project:** Blazor WebAssembly AI Chat Application
 **Domain:** Single-user personal AI chat web app (.NET / C# / OpenAI GPT)
 **Researched:** 2026-03-27
 **Confidence:** HIGH
 ## Executive Summary
 This is a single-user personal AI chat application built on Blazor WebAssembly with an ASP.NET Core backend. The project has a dual purpose: functioning as a useful personal tool and serving as a tutorial-quality reference implementation for Blazor WASM patterns. The recommended architecture is a strict two-project split — a standalone Blazor WASM client and a separate ASP.NET Core Minimal API server — reflecting a breaking change in .NET 8+ that removed the hosted Blazor WASM template. The client runs entirely in the browser; the server holds secrets, calls OpenAI, and manages disk persistence. This boundary is non-negotiable and must be established before any feature code is written.
 The core technical challenge is streaming. In .NET 9, token-by-token delivery to a Blazor WASM client requires an explicit opt-in (`SetBrowserResponseStreamingEnabled(true)`) on each `HttpRequestMessage`. Nothing sets it by default, and without it the response silently falls back to buffered delivery with no error. Combined with the need to call `StateHasChanged()` as tokens arrive so the UI updates, streaming is the highest-risk implementation step and must be validated early. All other features (conversation management, markdown rendering, copy-to-clipboard) are well-understood patterns with clear .NET implementations.
 The key risk profile is concentrated in Phase 1 (architecture foundation) and the streaming phase. Three "never" mistakes — putting the API key in WASM, writing file I/O in the WASM project, and calling OpenAI directly from WASM — must be locked out architecturally before feature development begins. Once those boundaries are established, the remainder of the v1 feature set follows a clear dependency chain from storage to conversations to streaming to UI polish.
 ## Key Findings
 ### Recommended Stack
 The stack is .NET 9 throughout: a `blazorwasm` standalone client and a `webapi` backend in a single solution, connected by HTTP and SSE. There are no exotic dependencies — the official `OpenAI` NuGet package (2.9.1, published by OpenAI directly) handles AI calls, `Markdig` (1.1.1) handles markdown-to-HTML conversion, `MudBlazor` (9.2.0) provides Material Design UI components with zero JavaScript dependencies, and `System.Text.Json` (built-in) handles JSON serialization and file storage. All versions are confirmed compatible with .NET 9.
 The most important stack decision is what to exclude: do not use `OpenAI-DotNet` (unofficial community package), `Microsoft.SemanticKernel` (excessive abstraction for v1), `Newtonsoft.Json` (superseded by System.Text.Json), `Blazored.LocalStorage` (wrong persistence layer for this architecture), or JSInterop for streaming (WASM has a native streaming opt-in that avoids it).
 **Core technologies:**
 - `.NET 9 SDK + C# 13`: Runtime and language; a stable Standard Term Support (STS) release rather than an LTS; both project types target `net9.0`
 - `Blazor WebAssembly (standalone)`: Client SPA — non-negotiable per project constraints; runs in-browser with no server round-trip for UI
 - `ASP.NET Core Minimal API`: Backend proxy — required to keep the OpenAI key server-side and to handle disk I/O that WASM cannot perform
 - `OpenAI` 2.9.1 (official): AI calls via `CompleteChatStreamingAsync()` with `await foreach`; the recommended .NET SDK when calling OpenAI directly
 - `Markdig` 1.1.1: Markdown rendering — de facto .NET standard; CommonMark-compliant; renders via `@((MarkupString)html)` with no JS interop
 - `MudBlazor` 9.2.0: UI components — pure C#, zero JS dependencies, comprehensive chat-friendly components, free license
 - `System.Text.Json` (built-in): Persistence — serialize conversations to JSON files on the server; no extra dependency
 ### Expected Features
 The feature set is well-defined by comparison to ChatGPT and Claude as reference products. The v1 scope is intentionally constrained to what makes the app genuinely usable, with explicit anti-features documented to prevent scope creep during implementation.
 **Must have (table stakes):**
 - Send message / receive streaming response — core loop; blocking responses are unacceptable by 2026 standards
 - Markdown rendering with syntax-highlighted code blocks: GPT models respond with markdown most of the time, and the raw output is hard to read
 - Multiple named conversations with create / switch / delete — without this, the app is a single disposable thread
 - JSON file persistence across sessions — conversations must survive page refresh to be useful
 - Auto-scroll to latest message and loading indicator — baseline polish that makes the app feel complete
 - Copy-to-clipboard on code blocks — high-frequency action for the developer-focused target user
 - Input disabled during streaming and send-on-Enter — prevents double-submit and matches chat conventions
 - Tutorial-style inline code comments — the project's defining purpose as a learning resource
 **Should have (competitive, v1.x):**
 - Auto-generated conversation titles — reduces naming friction; single GPT summarization call
 - System prompt / persona configuration — power-user feature; natural extension once multi-conversation works
 - Model selector (GPT-4o vs GPT-4o-mini) — cost/quality tradeoff; low implementation cost
 - Export conversation to markdown/text — low complexity, occasional high value
 **Defer (v2+):**
 - Message edit and regenerate — medium complexity; wait until core loop is solid
 - Token usage display: requires reading the usage data reported at the end of a streamed completion; not blocking for v1
 - LangChain / agentic workflows, RAG, MCP server integration — explicitly v2 per project intent
 - Voice input/output, image uploads, multi-user auth, PWA — all documented anti-features with clear rationale
 Feature dependencies are explicit: JSON storage must precede conversation management, which must precede conversation switching. A basic blocking API call must precede streaming. Markdown must precede syntax highlighting, which must precede copy-to-clipboard.
 ### Architecture Approach
 The architecture is a strict two-tier system: a Blazor WASM SPA in the browser communicating with an ASP.NET Core Minimal API via HTTP and SSE. State on the client is managed by a singleton `ConversationStateService` that raises `OnChange` events — components subscribe in `OnInitialized` and unsubscribe in `Dispose`. There is a `Shared` library project that holds `Conversation` and `ChatMessage` models used by both tiers, eliminating duplicate DTOs.
 Components are kept intentionally thin (data in via `[Parameter]`, actions out via `EventCallback<T>`). All logic lives in services. This is explicitly stated as a tutorial goal — fat components teach bad habits and are hard to explain.
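 The subscribe/unsubscribe pattern described above might look like the following sketch. The service is reduced to a single flag to keep the example short, and `ChatPageBase` is a hypothetical code-behind class; the real service would hold conversations and messages as well.
 ```csharp
 using System;
 using Microsoft.AspNetCore.Components;
 // Reduced, hypothetical version of the state service: one flag plus the event.
 public sealed class ConversationStateService
 {
     public event Action? OnChange;
     public bool IsStreaming { get; private set; }
     public void SetStreaming(bool value)
     {
         IsStreaming = value;
         OnChange?.Invoke(); // notify every subscribed component
     }
 }
 // Hypothetical code-behind for a container component such as ChatPage.
 public class ChatPageBase : ComponentBase, IDisposable
 {
     [Inject] public ConversationStateService State { get; set; } = default!;
     protected override void OnInitialized() => State.OnChange += HandleChange;
     // Re-render on the renderer's synchronization context.
     private void HandleChange() => _ = InvokeAsync(StateHasChanged);
     // Unsubscribe so the long-lived service does not keep the component alive.
     public void Dispose() => State.OnChange -= HandleChange;
 }
 ```
 Forgetting the `Dispose` step is the usual source of leaks here: the service outlives every component, so a dangling handler keeps disposed components reachable and re-rendering.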
 **Major components:**
 1. `ConversationStateService` (WASM singleton) — active conversation, message list, streaming flag; raises `OnChange` for all subscribed components
 2. `ChatApiClient` (WASM scoped service) — wraps `HttpClient`, handles SSE stream reading with `SetBrowserResponseStreamingEnabled(true)` and `ResponseHeadersRead`
 3. `OpenAiService` (server scoped) — wraps official OpenAI SDK, returns `IAsyncEnumerable<string>` of tokens to endpoint handlers
 4. `ConversationRepository` (server singleton) — reads/writes JSON files under a configurable data directory; uses `SemaphoreSlim(1,1)` for write serialization
 5. `ChatEndpoints` + `ConversationEndpoints` (server Minimal API) — thin HTTP layer wiring services to routes; SSE streaming endpoint proxies tokens to client
 6. Leaf UI components: `MessageBubble`, `ChatInput`, `ConversationList`, `MessageList` — pure display, no service calls
 7. Container component: `ChatPage` — composes all child components, owns the route (`@page "/chat/{id?}"`)
 **Build order:** Shared models → `ConversationRepository` → `OpenAiService` → server endpoints → `ChatApiClient` → `ConversationStateService` → leaf UI → container UI. This maps directly to implementation phases.
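 As an illustration of component 3, the sketch below follows the streaming pattern shown in the official `openai-dotnet` README (`CompleteChatStreamingAsync` consumed with `await foreach`). The `StreamReplyAsync` name is hypothetical, the `ChatMessage` type here is the SDK's `OpenAI.Chat.ChatMessage` rather than the Shared model, and the member names should be re-checked against the 2.9.1 package before use.
 ```csharp
 using System.Collections.Generic;
 using System.Runtime.CompilerServices;
 using System.Threading;
 using OpenAI.Chat;
 // Hypothetical server-side wrapper that flattens SDK updates into text tokens.
 public sealed class OpenAiService
 {
     private readonly ChatClient _chat;
     public OpenAiService(ChatClient chat) => _chat = chat;
     public async IAsyncEnumerable<string> StreamReplyAsync(
         IReadOnlyList<ChatMessage> history,
         [EnumeratorCancellation] CancellationToken ct = default)
     {
         await foreach (StreamingChatCompletionUpdate update in
             _chat.CompleteChatStreamingAsync(history, cancellationToken: ct))
         {
             // An update can carry zero or more content parts; skip empty ones.
             foreach (ChatMessageContentPart part in update.ContentUpdate)
             {
                 if (!string.IsNullOrEmpty(part.Text))
                 {
                     yield return part.Text;
                 }
             }
         }
     }
 }
 ```
 The `ChatClient` would be registered on the server only, for example with `new ChatClient(model: "gpt-4o", apiKey: key)`, where `key` comes from server-side configuration.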
 ### Critical Pitfalls
 1. **Streaming silently broken in WASM (Pitfall 1 + 4)**: Two distinct failure modes that look identical: (a) the HTTP request made from WASM does not call `SetBrowserResponseStreamingEnabled(true)`, so the browser buffers the entire response; (b) `StateHasChanged()` is not called as tokens arrive, so Blazor batches all renders until the stream completes. Both produce the same symptom: tokens appear all at once. Fix: set the opt-in on the `HttpRequestMessage` in the WASM `ChatApiClient` (the custom `BlazorHttpClientTransport` from `openai-dotnet` issue #65 applies only when the OpenAI SDK itself runs inside WASM, which this architecture avoids), and call `StateHasChanged()` plus `await Task.Yield()` inside the `await foreach` token loop. Throttle renders to ~50 ms intervals to prevent UI thread starvation at GPT-4o token speeds.
 2. **API key exposure in WASM (Pitfall 2)**: `wwwroot/appsettings.json` is a static file served to any browser visitor, and any other configuration value the WASM project can read is likewise downloaded to the browser in plaintext. The key must live exclusively in the server project, supplied via server-side `dotnet user-secrets` or environment variables. This boundary must be established in Phase 1 and never crossed.
 3. **File I/O in WASM project (Pitfall 5)**: `System.IO.File` compiles in WASM but writes to an in-memory virtual filesystem that resets on every page refresh. All persistence must go through backend API endpoints. This reinforces the same architectural boundary as the API key rule.
 4. **Scoped DI = Singleton in WASM (Pitfall 3)**: In Blazor WASM there is exactly one DI scope for the lifetime of the tab, so a service registered as `Scoped` behaves like a singleton and never resets. Design `ConversationStateService` to hold a collection keyed by conversation ID, not mutable "current conversation" fields that leak state when the user switches conversations.
 5. **IL trimming breaks Release builds (Pitfall 6)** — Debug builds do not trim; published builds do. JSON serialization properties, DI-resolved types, and JSInterop callbacks can be silently stripped. Use `[JsonSerializable]` source generators on all model types and run `dotnet publish` once in Phase 1 to catch trim warnings while the surface is small.
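 For the trimming pitfall in item 5, a source-generated JSON context might look like the sketch below. The record shapes and the `AppJsonContext` and `ConversationJson` names are assumptions for illustration; the technique is the standard `System.Text.Json` source generator, which emits serialization metadata at compile time so that it does not rely on reflection over members the trimmer may remove.
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.Text.Json;
 using System.Text.Json.Serialization;
 // Hypothetical model shapes; the real ones would live in the Shared project.
 public sealed record ChatMessage(string Role, string Content);
 public sealed record Conversation(Guid Id, string Title, List<ChatMessage> Messages);
 // The generator fills in this partial class at compile time.
 [JsonSourceGenerationOptions(WriteIndented = true)]
 [JsonSerializable(typeof(Conversation))]
 [JsonSerializable(typeof(List<Conversation>))]
 public partial class AppJsonContext : JsonSerializerContext
 {
 }
 public static class ConversationJson
 {
     // Passing the generated type info avoids the reflection-based code path.
     public static string Serialize(Conversation conversation) =>
         JsonSerializer.Serialize(conversation, AppJsonContext.Default.Conversation);
     public static Conversation? Deserialize(string json) =>
         JsonSerializer.Deserialize(json, AppJsonContext.Default.Conversation);
 }
 ```
 Running `dotnet publish -c Release` after adding the context is still worthwhile, since trim warnings for other types (DI registrations, JS interop callbacks) are only reported at publish time.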
 ## Implications for Roadmap
 Based on combined research, the architecture dependency chain and pitfall prevention requirements suggest five phases:
 ### Phase 1: Architecture Foundation
 **Rationale:** Three critical "never" mistakes (API key in WASM, file I/O in WASM, direct OpenAI call from WASM) must be architecturally locked before any feature code is written. The WASM/backend split is the load-bearing constraint everything else depends on. This phase also establishes the `Shared` models library which both tiers need immediately.
 **Delivers:** Working solution structure with two projects + shared library; CORS configured; basic HTTP connectivity verified WASM-to-server; `dotnet publish` tested once to catch IL trim warnings early; placeholder endpoints in place; no OpenAI calls yet.
 **Addresses:** Project scaffolding, solution structure (FEATURES.md scaffolding prerequisite)
 **Avoids:** API key exposure (Pitfall 2), file I/O in WASM (Pitfall 5), direct OpenAI calls from WASM (Architecture anti-pattern 1), IL trimming surprises (Pitfall 6)
 ### Phase 2: Conversation Storage and Management
 **Rationale:** JSON file storage is the prerequisite for every conversation-related feature. Per the feature dependency graph: `[JSON File Storage] → [Multiple Conversations] → [Create/Switch/Delete/Persist]`. This phase must come before any AI integration because the persistence layer needs to exist before we can store AI responses.
 **Delivers:** `ConversationRepository` with full CRUD, `ConversationEndpoints` wired to HTTP routes, `ConversationList` sidebar component, create/switch/delete conversations working, conversation history persisted to disk and loaded on startup. The app has no AI yet but has a working conversation management UI.
 **Uses:** `System.Text.Json` built-in, `SemaphoreSlim(1,1)` for write serialization, `MudBlazor` for sidebar components
 **Implements:** `ConversationRepository`, `ConversationEndpoints`, `ConversationStateService` (initial version), `ConversationList.razor`
 **Avoids:** Scoped DI state leaks (Pitfall 3) — design `ConversationStateService` with `Dictionary<Guid, ConversationState>` from the start
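 The `Dictionary<Guid, ConversationState>` design mentioned above could be sketched as follows. The `ConversationState` members and the method names are hypothetical, and change notification is left out to keep the focus on the keyed storage.
 ```csharp
 using System;
 using System.Collections.Generic;
 // Hypothetical per-conversation state.
 public sealed class ConversationState
 {
     public List<string> Messages { get; } = new();
     public bool IsStreaming { get; set; }
 }
 public sealed class ConversationStateService
 {
     // State is stored per conversation ID, so switching never overwrites another conversation.
     private readonly Dictionary<Guid, ConversationState> _states = new();
     public Guid? ActiveId { get; private set; }
     // The active conversation is derived from the ID instead of being a separate mutable copy.
     public ConversationState? Active =>
         ActiveId is Guid id && _states.TryGetValue(id, out var state) ? state : null;
     public ConversationState GetOrCreate(Guid id)
     {
         if (!_states.TryGetValue(id, out var state))
         {
             state = new ConversationState();
             _states[id] = state;
         }
         return state;
     }
     public void Activate(Guid id)
     {
         GetOrCreate(id);
         ActiveId = id;
     }
     public void Remove(Guid id)
     {
         _states.Remove(id);
         if (ActiveId == id) ActiveId = null;
     }
 }
 ```
 Because the service lives for the whole tab session, a stream that is still running for one conversation keeps appending to its own entry even after the user activates a different one.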
 ### Phase 3: Basic AI Chat (Non-Streaming)
 **Rationale:** Per the feature dependency chain, a working blocking API call must be established before streaming is layered on top. Building non-streaming first validates the full request/response shape, CORS, error handling, and conversation history construction without the added complexity of SSE. This is the correct learning sequence for a tutorial project.
 **Delivers:** Full chat loop working end-to-end: user sends message → backend calls OpenAI → response appended to conversation → conversation saved to disk. All without streaming. Markdown rendering added here because GPT responses with raw markdown are effectively unusable and would make all testing painful.
 **Uses:** `OpenAI` 2.9.1 SDK, `Markdig` 1.1.1, `MudBlazor` chat components
 **Implements:** `OpenAiService`, `ChatEndpoints` (non-streaming POST), `ChatApiClient` (basic POST), `MessageBubble.razor` with `@((MarkupString)html)` rendering, `ChatInput.razor`
 **Avoids:** Markdown XSS via raw `MarkupString` (PITFALLS integration gotchas) — sanitize or accept risk explicitly in code comments
 ### Phase 4: Streaming Responses
 **Rationale:** Streaming is the highest-risk implementation step. Research identified two independent failure modes (transport not set, `StateHasChanged` not called) that produce identical symptoms. Addressing this in its own phase means streaming can be diagnosed and debugged in isolation, without other variables. All streaming-specific patterns — `BlazorHttpClientTransport`, SSE endpoint, `ResponseHeadersRead`, per-token `StateHasChanged` with throttling — are introduced and documented here.
 **Delivers:** Token-by-token streaming from OpenAI through the backend SSE endpoint to the WASM UI. Loading indicator shown immediately on send, hidden on first token. Auto-scroll to latest message. Input disabled during streaming. Cancel button wired to `CancellationToken`. Stream throttling (~50ms) to prevent UI thread starvation.
 **Uses:** `SetBrowserResponseStreamingEnabled(true)`, `HttpCompletionOption.ResponseHeadersRead`, `text/event-stream` SSE frames, `await Task.Yield()` in token loop
 **Implements:** Streaming `ChatEndpoints`, updated `ChatApiClient` with stream reader, updated `MessageList` and `ChatPage` with streaming state
 **Avoids:** Streaming silently broken (Pitfall 1), UI freeze without `StateHasChanged` (Pitfall 4), UI thread starvation from unthrottled renders (PITFALLS performance traps)
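 One way to implement the ~50 ms throttle is a timer-based loop like the sketch below. The `StreamThrottle` helper and its parameters are hypothetical, and the interval is passed in so it stays a tunable value. The first token is rendered immediately so the loading indicator can be hidden without delay.
 ```csharp
 using System;
 using System.Collections.Generic;
 using System.Diagnostics;
 using System.Text;
 using System.Threading;
 using System.Threading.Tasks;
 // Hypothetical helper: appends tokens to a buffer and limits how often the UI re-renders.
 public static class StreamThrottle
 {
     public static async Task PumpAsync(
         IAsyncEnumerable<string> tokens,
         StringBuilder buffer,
         Action render,
         TimeSpan minInterval,
         CancellationToken ct = default)
     {
         var sinceRender = Stopwatch.StartNew();
         bool rendered = false;
         await foreach (string token in tokens.WithCancellation(ct))
         {
             buffer.Append(token);
             // Render the first token at once, then at most once per interval.
             if (!rendered || sinceRender.Elapsed >= minInterval)
             {
                 render();
                 rendered = true;
                 sinceRender.Restart();
                 // Give the renderer a chance to run before reading more tokens.
                 await Task.Yield();
             }
         }
         // Final flush so tokens received since the last render are shown.
         render();
     }
 }
 ```
 Inside a component the call would look like `await StreamThrottle.PumpAsync(stream, _draft, StateHasChanged, TimeSpan.FromMilliseconds(50), ct)`, where `_draft` is an assumed `StringBuilder` field bound to the in-progress message.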
 ### Phase 5: Polish and v1.x Features
 **Rationale:** Once the core loop (storage + AI + streaming) is solid, the remaining v1.x features are all low-to-medium complexity additions that build on the established foundation. Grouping them together allows the tutorial narrative to focus on "extending a working app" rather than "getting the basics right."
 **Delivers:** Auto-generated conversation titles (GPT summarization call after first exchange), syntax-highlighted code blocks (`Markdown.ColorCode` Markdig pipeline extension), copy-to-clipboard on code blocks (JS interop via `navigator.clipboard.writeText`), responsive layout for mobile, error handling with user-visible messages, model selector dropdown (GPT-4o vs GPT-4o-mini). Optional v1.x additions: system prompt configuration, export conversation.
 **Uses:** `Markdown.ColorCode` NuGet package (base package, NOT `CSharpToColoredHtml` which breaks WASM), `navigator.clipboard` JS interop
 **Implements:** Updated `MarkdownPipeline` with ColorCode extension, `ClipboardService.cs` JS interop wrapper, settings model for model selection
 ### Phase Ordering Rationale
 - **Architecture before features** prevents the three hardest-to-recover-from mistakes (API key exposure, WASM file I/O, wrong project boundaries) from being baked in.
 - **Storage before AI** follows the feature dependency graph exactly: conversations need a home before AI responses can be stored in them.
 - **Non-streaming before streaming** validates the full request/response shape with simpler code, making streaming easier to debug when it is introduced.
 - **Streaming as its own phase** isolates the highest-risk technical challenge. Combined with the tutorial purpose, this also makes for a clear "here is how streaming actually works in Blazor WASM" chapter.
 - **Polish last** respects the single-responsibility of each phase and avoids complexity interleaving.
 ### Research Flags
 Phases likely needing deeper research during planning (i.e., run `/gsd:research-phase`):
 - **Phase 4 (Streaming):** The `BlazorHttpClientTransport` workaround and SSE frame format have multiple interacting constraints. Phase planning should re-verify the current state of `openai-dotnet` issue #65 and confirm whether .NET 9.x patch releases have changed the default behavior. Token throttling strategy (timer vs counter) also warrants a concrete recommendation.
 - **Phase 5 (Markdown.ColorCode + JS Interop):** The WASM compatibility note (base `Markdown.ColorCode` works; `CSharpToColoredHtml` does not) was sourced from community reports. Verify against the current NuGet package version before implementing.
 Phases with standard patterns (skip research-phase):
 - **Phase 1 (Architecture Foundation):** The two-project solution structure and CORS setup are fully documented in official Microsoft docs. No novel patterns.
 - **Phase 2 (Conversation Storage):** Repository pattern with JSON file I/O is a standard .NET pattern. `SemaphoreSlim` for single-writer serialization is well-documented.
 - **Phase 3 (Basic AI Chat):** OpenAI SDK usage for non-streaming chat completions is documented in the official SDK repo with examples. Markdig integration in Blazor has multiple tutorial references.
 ## Confidence Assessment
 | Area | Confidence | Notes |
 |------|------------|-------|
 | Stack | HIGH | All package versions verified on nuget.org; official SDK status confirmed by the .NET Blog announcement; version compatibility table verified against published TFM support |
 | Features | HIGH | Feature set cross-referenced against live ChatGPT and Claude UX; OpenAI streaming API docs consulted; Blazor-specific constraints verified |
 | Architecture | HIGH | Microsoft official Blazor docs + verified community implementations (PalmHill.BlazorChat reference); all patterns confirmed with working code samples |
 | Pitfalls | HIGH | Critical pitfalls sourced from official GitHub issue tracker (`openai-dotnet` #65, `aspnetcore` #43098), Microsoft Q&A, and documented production experience |
 **Overall confidence:** HIGH
 ### Gaps to Address
 - **Streaming transport behavior in .NET 9 patch releases:** The `SetBrowserResponseStreamingEnabled(true)` workaround is confirmed required in .NET 9 and becomes the default in .NET 10. A .NET 9.x patch release may have changed this behavior, so verify at the start of Phase 4 by checking the official .NET 9 breaking change notes.
 - **StateHasChanged throttling threshold:** Research recommends ~50ms or every N tokens, but the optimal value depends on GPT-4o's actual token delivery rate and the target device's rendering performance. Treat as a tunable constant in code rather than a magic number.
 - **XSS risk of rendering GPT output as MarkupString:** This is a known accepted risk for a single-user personal tool. Document the decision explicitly in the code (tutorial purpose) rather than leaving it as a silent assumption. Consider adding `Markdig`'s `DisableHtml()` pipeline option as a low-friction mitigation.
 - **CORS configuration for deployment:** Research covered localhost development CORS. If the app is ever deployed (even to a home server), the CORS origin list needs updating. Document this as a deployment note in Phase 1.
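 For the CORS item, a configuration-driven policy in the API's `Program.cs` keeps the origin list out of code, so a later deployment only needs a settings change. This is a sketch under assumptions: the `Cors:AllowedOrigins` configuration key, the localhost fallback origin, and the `/api/health` route are all hypothetical.
 ```csharp
 using Microsoft.AspNetCore.Builder;
 using Microsoft.AspNetCore.Http;
 using Microsoft.Extensions.Configuration;
 using Microsoft.Extensions.DependencyInjection;
 var builder = WebApplication.CreateBuilder(args);
 // Read allowed origins from configuration; fall back to an assumed local dev origin.
 string[] allowedOrigins =
     builder.Configuration.GetSection("Cors:AllowedOrigins").Get<string[]>()
     ?? new[] { "https://localhost:5001" };
 builder.Services.AddCors(options =>
     options.AddDefaultPolicy(policy => policy
         .WithOrigins(allowedOrigins)
         .AllowAnyHeader()
         .AllowAnyMethod()));
 var app = builder.Build();
 // Apply the default policy before the endpoints are mapped.
 app.UseCors();
 // Placeholder endpoint for verifying WASM-to-server connectivity.
 app.MapGet("/api/health", () => Results.Ok());
 app.Run();
 ```
 The fallback origin must match whatever port the WASM client's dev server actually uses, which is listed in that project's `launchSettings.json`.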
 ## Sources
 ### Primary (HIGH confidence)
 - https://www.nuget.org/packages/OpenAI — OpenAI 2.9.1 version and publish date confirmed
 - https://github.com/openai/openai-dotnet — Streaming API (`CompleteChatStreamingAsync`, `await foreach`) verified
 - https://www.nuget.org/packages/Markdig — Markdig 1.1.1 confirmed; .NET 8 TFM confirmed .NET 9 compatible
 - https://www.nuget.org/packages/MudBlazor — MudBlazor 9.2.0 confirmed; .NET 8/9/10 support listed
 - https://learn.microsoft.com/en-us/aspnet/core/blazor/hosting-models?view=aspnetcore-9.0 — Standalone WASM vs Blazor Web App distinction; hosted template removal confirmed
 - https://learn.microsoft.com/en-us/dotnet/core/compatibility/networking/10.0/default-http-streaming — WASM streaming opt-in (.NET 9) vs default (.NET 10) breaking change
 - https://learn.microsoft.com/en-us/aspnet/core/blazor/call-web-api?view=aspnetcore-10.0 — HttpClient streaming patterns for Blazor
 - https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/dependency-injection — Official DI lifetime guidance; Scoped = Singleton in WASM
 - https://learn.microsoft.com/en-us/aspnet/core/blazor/security/webassembly/ — API key security; no secrets in WASM bundle
 - https://learn.microsoft.com/en-us/aspnet/core/blazor/performance/rendering — StateHasChanged and re-render control
 - https://devblogs.microsoft.com/dotnet/openai-dotnet-library/ — Official .NET Blog announcement of the OpenAI SDK
 ### Secondary (MEDIUM confidence)
 - https://github.com/openai/openai-dotnet/issues/65 — Confirmed `SetBrowserResponseStreamingEnabled(true)` fix with `BlazorHttpClientTransport`; community-verified workaround
 - https://www.meziantou.net/streaming-an-http-response-in-blazor-webassembly.htm — WASM streaming HttpClient patterns
 - https://www.thinktecture.com/en/blazor/dependency-injection-scopes-in-blazor/ — Scoped = Singleton in WASM; verified against official docs
 - https://www.strathweb.com/2024/07/built-in-support-for-server-sent-events-in-net-9/ — SSE native support in .NET 9
 - https://github.com/edgett/PalmHill.BlazorChat — Reference implementation; WASM + WebAPI + real-time LLM
 - https://dev.to/janhjordie/real-blazor-webassembly-production-pitfalls-3hmf — IL trimming, JS interop, release-only failures
 - https://github.com/dotnet/aspnetcore/issues/43098 — StateHasChanged not firing with IAsyncEnumerable
 ### Tertiary (MEDIUM-LOW confidence, validate before use)
 - https://dev.to/kazinix/blazor-web-app-webassembly-hosted-in-net8-and-net9-1k6g — Hosted template removal in .NET 8+ (single community source; cross-checked against official docs)
 - https://chandradev819.wordpress.com/2025/12/17/10-blazor-coding-mistakes-i-see-in-real-projects-and-how-to-avoid-them/ — Fat component patterns, DI misuse
 - https://www.nuget.org/packages/Markdown.ColorCode — WASM base package compatibility note (community-reported; verify during Phase 5 implementation)
 ---
 *Research completed: 2026-03-27*
 *Ready for roadmap: yes*