docs: complete project research

2026-03-27 00:59:24 +00:00
parent b45ae0400e
commit d9878dea73
6 changed files with 1268 additions and 0 deletions
--- a/.planning/research/PITFALLS.md
+++ b/.planning/research/PITFALLS.md
@@ -0,0 +1,302 @@
+# Pitfalls Research
+
+**Domain:** Blazor WebAssembly AI Chat Application (OpenAI GPT, JSON storage, streaming)
+**Researched:** 2026-03-27
+**Confidence:** HIGH (multiple authoritative sources: official GitHub issues, Microsoft docs, verified community findings)
+
+---
+
+## Critical Pitfalls
+
+### Pitfall 1: Streaming Silently Broken in Blazor WASM Without Custom Transport
+
+**What goes wrong:**
+OpenAI's .NET SDK streaming (`CompleteChatStreamingAsync`, returning `IAsyncEnumerable`) does not stream token-by-token in Blazor WASM. The entire response arrives at once after generation completes, making it appear like a non-streaming call. There is no error thrown — it just does not stream. This is confirmed in the official `openai-dotnet` GitHub issue tracker (#65).
+
+**Why it happens:**
+Blazor WASM uses a browser-based `HttpClient` backed by the Fetch API via JS interop. Response streaming requires explicitly calling `SetBrowserResponseStreamingEnabled(true)` on the underlying `HttpRequestMessage`. The OpenAI .NET SDK does not set this flag by default. Without it, the browser buffers the entire response body before exposing it to the .NET layer.
+
+**How to avoid:**
+Create a custom `HttpClientPipelineTransport` that overrides `OnSendingRequest` to enable browser streaming:
+
+```csharp
+public class BlazorHttpClientTransport : HttpClientPipelineTransport
+{
+    protected override void OnSendingRequest(
+        PipelineMessage message,
+        HttpRequestMessage httpRequest)
+    {
+        httpRequest.SetBrowserResponseStreamingEnabled(true);
+    }
+}
+
+// Wire up at client construction:
+var options = new OpenAIClientOptions();
+options.Transport = new BlazorHttpClientTransport();
+var chatClient = new ChatClient(model: "gpt-4o", apiKey, options);
+```
+
+This must be done server-side (in the backend API that proxies to OpenAI), not in the WASM client directly.
+
+**Warning signs:**
+- Tokens appear all at once after a delay instead of progressively
+- No console errors — it looks like it is working but is not streaming
+- Local dev works "fine" because the full response still arrives, just not incrementally
+
+**Phase to address:**
+Phase that introduces streaming (SSE/token-by-token rendering). Must be addressed before any streaming demo is built.
+
+---
+
+### Pitfall 2: OpenAI API Key Exposed in Blazor WASM Client
+
+**What goes wrong:**
+A developer puts the OpenAI API key in `wwwroot/appsettings.json` or reads it from `IConfiguration` in a WASM component. The key is then downloadable by any browser visitor by requesting `/_framework/blazor.boot.json` or simply navigating to `wwwroot/appsettings.json` directly. The key is burned.
+
+**Why it happens:**
+Experienced C# developers coming from ASP.NET Core or console apps expect `appsettings.json` and `IConfiguration` to be server-side. In Blazor WASM, `wwwroot/appsettings.json` is a static file served to the browser — it is not protected in any way. User Secrets also do not help: they are embedded into the published bundle in plaintext.
+
+**How to avoid:**
+The OpenAI API key must live exclusively in the backend API (ASP.NET Core Minimal API or Web API project). The WASM client calls a backend endpoint (e.g., `/api/chat`) that holds the key and proxies requests to OpenAI. The key never appears in any client-side file or JS bundle. Use `dotnet user-secrets` on the server project only.
+
+**Warning signs:**
+- Any `IConfiguration["OpenAI:ApiKey"]` usage in a `.razor` file or service registered in the WASM `Program.cs`
+- `appsettings.json` in `wwwroot/` containing any token, key, or secret
+- The word `sk-` visible in browser DevTools → Network → response body for any `.json` file
+
+**Phase to address:**
+Phase 1 (project setup / architecture foundation). This must be locked in before any OpenAI code is written.
+
+---
+
+### Pitfall 3: Scoped DI Services Act as Singletons in WASM — State Leaks Across Conversations
+
+**What goes wrong:**
+A developer registers a `ConversationService` or `ChatStateService` as `Scoped`, expecting it to reset between logical "sessions" like it would in ASP.NET Core (per-request). In Blazor WASM there is exactly one DI scope for the lifetime of the browser tab. The service never resets. All conversations accumulate state in a single object, producing corrupted cross-conversation history.
+
+**Why it happens:**
+In ASP.NET Core, Scoped = per HTTP request. In Blazor WASM, Scoped = per application lifetime (equivalent to Singleton). There is no shorter-lived scope unless you use `OwningComponentBase`. Developers familiar with server-side DI expect different behavior.
+
+**How to avoid:**
+- Understand that in WASM, `Scoped` and `Singleton` are functionally identical
+- For services that manage per-conversation state, design them to hold a collection keyed by conversation ID rather than holding mutable "current conversation" state
+- If a service must be component-scoped, inherit from `OwningComponentBase` which creates a DI scope tied to the component's lifetime
+- Never store mutable "active session" state in a scoped/singleton service; store a dictionary of `ConversationId → ConversationState`
+
+**Warning signs:**
+- Switching conversations causes the wrong history to appear
+- Deleting a conversation does not fully clear its state from memory
+- Services have fields like `CurrentConversation` or `ActiveMessages` rather than `Dictionary<Guid, Conversation>`
+
+**Phase to address:**
+Phase introducing conversation state management and multi-conversation switching.
+
+---
+
+### Pitfall 4: `StateHasChanged` Not Called During Token Streaming — UI Freezes Until Completion
+
+**What goes wrong:**
+The developer wires up streaming correctly (transport fixed, backend proxying works) but the UI does not update token-by-token. The message bubble stays empty until all tokens arrive, then the entire response appears at once. This is indistinguishable from the streaming transport bug (Pitfall 1) if not diagnosed carefully.
+
+**Why it happens:**
+Blazor does not automatically re-render after every `await` inside an `async` event handler or lifecycle method. When consuming an `IAsyncEnumerable<string>` (streaming tokens), the component must explicitly call `StateHasChanged()` after appending each token. Without this call, Blazor batches rendering and only repaints when the entire method completes.
+
+**How to avoid:**
+```csharp
+await foreach (var token in streamingResponse)
+{
+    currentMessage += token;
+    StateHasChanged(); // required — Blazor will not re-render otherwise
+    await Task.Yield(); // prevents UI thread starvation on rapid token delivery
+}
+```
+
+Additionally, consider throttling `StateHasChanged` calls (e.g., every 50ms or every N tokens) to avoid excessive rendering if token delivery is very fast.
+
+**Warning signs:**
+- Streaming transport is confirmed working (via backend logs) but UI still shows nothing until complete
+- Token-by-token updates visible in server logs but not in the browser
+- Removing `await Task.Yield()` causes the browser tab to become unresponsive during streaming
+
+**Phase to address:**
+Streaming UI rendering phase. Document this explicitly inline in the component code.
+
+---
+
+### Pitfall 5: JSON File Storage Architecture Assumes Server Filesystem — Not Viable Pure Client-Side
+
+**What goes wrong:**
+A developer writes file I/O code (`File.ReadAllText`, `File.WriteAllText`) directly in the WASM project. The code compiles without error but throws at runtime because Blazor WASM runs in a browser sandbox with no access to the host filesystem. The virtual WASM filesystem resets on every page refresh.
+
+**Why it happens:**
+C# file APIs exist in the WASM .NET runtime but map to an in-memory virtual filesystem, not the OS disk. Developers coming from console or desktop C# assume `File.WriteAllText("conversations.json", json)` writes to disk. It does not — the data vanishes on refresh.
+
+**How to avoid:**
+JSON file storage must live in the backend API (server-side), not in the WASM client. The correct architecture:
+- WASM client calls `POST /api/conversations` → backend writes JSON to disk
+- WASM client calls `GET /api/conversations` → backend reads JSON from disk and returns it
+- Backend stores files in a configurable local path (e.g., `~/chat-data/`)
+
+This reinforces the same architectural boundary required for API key protection (Pitfall 2).
+
+**Warning signs:**
+- Any `System.IO.File` or `System.IO.Directory` usage inside the WASM project (`Client/`)
+- Conversations persist during a session but disappear on browser refresh
+- Data is present in the WASM virtual FS (`MemoryFileSystem`) but absent from the OS
+
+**Phase to address:**
+Phase 1 (architecture setup). The WASM/backend split must be established before any persistence code is written.
+
+---
+
+### Pitfall 6: IL Trimming Silently Breaks Code in Release Builds
+
+**What goes wrong:**
+The app works perfectly in `dotnet run` (Debug) but breaks in `dotnet publish` (Release). JSON serialization loses properties, services cannot be resolved, or features silently stop working. No exceptions in development, cryptic failures in production.
+
+**Why it happens:**
+Blazor WASM uses aggressive IL trimming (ILLink) during publish to reduce bundle size. The trimmer performs static analysis and removes types/methods that appear unreachable — including types used only via reflection (JSON serialization, DI, JSInterop callbacks). Debug builds do not trim.
+
+**How to avoid:**
+- Use `[JsonSerializable]` with `System.Text.Json` source generation for all DTO types
+- Apply `[DynamicDependency]` to methods called via reflection
+- Apply `[JSInvokable]` to all methods callable from JavaScript
+- Run `dotnet publish` early in the project (Phase 1 or 2) to detect trim warnings while the surface is small
+- Treat `<TrimmerRootDescriptor>` as a last resort, not a first step
+
+**Warning signs:**
+- App works in `dotnet run` but throws `NullReferenceException` or loses data after `dotnet publish`
+- JSON responses missing properties that were present in debug
+- IL trimmer warnings during publish that were ignored
+
+**Phase to address:**
+Phase 1 (publish pipeline verification). Also Phase covering JSON data models for conversations.
+
+---
+
+## Technical Debt Patterns
+
+| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
+|----------|-------------------|----------------|-----------------|
+| Put API key in WASM `appsettings.json` | Faster to get OpenAI call working | Key is permanently burned; must rotate | Never |
+| Call OpenAI directly from WASM HttpClient | Eliminates backend project | API key exposed; no server-side rate limiting; blocks v2 RAG/LangChain which needs server | Never |
+| Write file I/O in WASM project | Familiar C# patterns | Silent data loss on refresh; hard to migrate later | Never |
+| All logic in `.razor` files | Faster iteration in early phases | Untestable; components become unmaintainable; hard to add v2 agent layer | Phase 1 only, refactor before Phase 3 |
+| Single `ChatService` singleton holding all state | Simple to start | State leaks across conversations; breaks multi-conversation feature | Never in this project — multi-conversation is a core requirement |
+| Skip `StateHasChanged` calls during streaming | Code is simpler | UI appears broken; streaming appears non-functional | Never |
+| Skip IL trim testing until "done" | Saves time during early phases | Trim bugs compound as codebase grows | Acceptable in Phase 1-2 if `dotnet publish` test is added to Phase 3 |
+
+---
+
+## Integration Gotchas
+
+| Integration | Common Mistake | Correct Approach |
+|-------------|----------------|-----------------|
+| OpenAI .NET SDK + Blazor WASM | Using SDK directly in WASM project without custom transport — streaming silently broken | Custom `BlazorHttpClientTransport` with `SetBrowserResponseStreamingEnabled(true)` on backend; or server-side only call |
+| OpenAI streaming via backend proxy | Backend returns full `StreamingChatCompletionUpdate` objects — client gets batched response | Backend uses `IAsyncEnumerable` with `[EnumeratorCancellation]`; streams SSE or NDJSON to client |
+| CORS between WASM client and backend API | Forgetting `AddCors` + `UseCors` on backend; 403/CORS errors during local dev | Configure CORS policy explicitly in backend `Program.cs`; scope to localhost dev origins |
+| JSON serialization of conversation models | Properties stripped by trimmer in Release; conversation history loses fields | Use `System.Text.Json` source generators with `[JsonSerializable]` for all model types |
+| Markdown rendering (AI responses) | Using `MarkupString` with raw AI output — XSS risk if AI returns script tags | Use a library like `Markdig` server-side, or a client-side library with HTML sanitization enabled |
+
+---
+
+## Performance Traps
+
+| Trap | Symptoms | Prevention | When It Breaks |
+|------|----------|------------|----------------|
+| Calling `StateHasChanged` on every token without throttling | Browser tab becomes unresponsive during fast streaming; CPU spikes | Throttle to every 50ms or every N tokens using a timer or counter | At ~20+ tokens/second (typical GPT-4o speed) |
+| Re-rendering entire conversation list on every message append | Visible flicker; full list DOM re-created on each token | Use `@key` directive on conversation list items; isolate streaming component from sidebar | At 5+ conversations in the list |
+| Loading all conversation history on app start | Slow initial load for users with many old conversations | Lazy-load conversation content; sidebar shows metadata only; load full messages on selection | At 50+ conversations stored in JSON |
+| Excessive component nesting for chat messages | Sluggish scroll performance with many messages | Keep message list in a single component with virtualization (`Virtualize`) for long histories | At 200+ messages in a single conversation |
+
+---
+
+## Security Mistakes
+
+| Mistake | Risk | Prevention |
+|---------|------|------------|
+| OpenAI API key in `wwwroot/appsettings.json` | Key visible to any browser user; unlimited API charges | Key lives only in backend server project; accessed via `dotnet user-secrets` or environment variable |
+| Calling OpenAI API directly from WASM (no backend) | Same as above; also bypasses rate limiting and logging | Mandatory backend proxy — this is enforced by the architecture from Phase 1 |
+| Rendering AI response HTML without sanitization | AI model could produce `<script>` tags; XSS attack on self | Use Markdown-to-HTML library with HTML sanitization; never use raw `MarkupString` on LLM output |
+| Storing secrets in WASM `IConfiguration` | Even runtime config values in WASM are readable from browser | All secrets in server-side `IConfiguration` only; WASM config contains only public values (API endpoint URLs) |
+| CORS wildcard (`AllowAnyOrigin`) in backend | Allows any site to call the local chat backend | Restrict CORS to `localhost` origins during development; lock down on deploy |
+
+---
+
+## UX Pitfalls
+
+| Pitfall | User Impact | Better Approach |
+|---------|-------------|-----------------|
+| No loading indicator while waiting for first streaming token | App appears frozen; user thinks it broke | Show typing indicator / spinner immediately on message send; hide when first token arrives |
+| No way to cancel an in-progress stream | User must wait for full response even if they sent the wrong message | Pass `CancellationToken` through to streaming call; expose cancel button that calls `cts.Cancel()` |
+| Auto-scroll that fights user scroll position | User scrolls up to read history; app violently scrolls back to bottom on each token | Only auto-scroll if the user is already at the bottom; detect scroll position before each `StateHasChanged` |
+| No error message when OpenAI API call fails | Empty response bubble; user does not know what happened | `try/catch` around all API calls; display inline error in message bubble with retry option |
+| Conversation list has no visual indication of active conversation | User loses track of which conversation is displayed | Highlight active conversation item; update `document.title` with conversation name |
+| Markdown rendered as raw text | AI responses with code blocks and lists look like symbol-laden garbage | Wire up Markdown renderer before any AI content is displayed — do not defer this |
+
+---
+
+## "Looks Done But Isn't" Checklist
+
+- [ ] **Streaming:** Tokens appear progressively in the UI — verify this is actual token-by-token delivery, not a batched response that arrives quickly. Check with a slow prompt.
+- [ ] **API key security:** Open DevTools → Network → find any `.json` request → confirm no `sk-` prefixed values appear in any response body.
+- [ ] **Conversation persistence:** Close and reopen the browser tab (not just refresh). Confirm conversations are still present.
+- [ ] **Multi-conversation isolation:** Open conversation A, send a message. Switch to conversation B. Verify conversation A's messages do not bleed into B.
+- [ ] **Stream cancellation:** Start a long generation. Click cancel or navigate away. Confirm the backend stops consuming tokens (check backend logs).
+- [ ] **Release build:** Run `dotnet publish` at least once before calling any phase "done." Confirm the published app loads and all features work.
+- [ ] **Error handling:** Temporarily set an invalid API key. Confirm the UI shows a user-friendly error rather than a blank component or silent failure.
+- [ ] **Markdown rendering:** Ask the AI for a code snippet. Confirm it renders in a code block, not as raw backtick-surrounded text.
+
+---
+
+## Recovery Strategies
+
+| Pitfall | Recovery Cost | Recovery Steps |
+|---------|---------------|----------------|
+| API key exposed in WASM | HIGH | Immediately rotate key in OpenAI dashboard; add `Client/` project scan to CI to block any future secret patterns |
+| All logic in `.razor` files (discovered late) | MEDIUM | Extract services incrementally — one component per PR; do not refactor all at once |
+| File I/O written in WASM project | MEDIUM | Move all persistence calls to backend API; update WASM to use `HttpClient` calls to new endpoints |
+| Streaming not working (transport not set) | LOW | Add `BlazorHttpClientTransport` wrapper — 10-line fix once identified; the hard part is diagnosing it |
+| Scoped service holding mutable conversation state | MEDIUM | Redesign service to hold `Dictionary<Guid, ConversationState>`; update all call sites |
+| IL trim breaks release build | MEDIUM–HIGH (depends on when discovered) | Add source generators for all model types; treat every trim warning as a compile error |
+
+---
+
+## Pitfall-to-Phase Mapping
+
+| Pitfall | Prevention Phase | Verification |
+|---------|------------------|--------------|
+| API key in WASM client | Phase 1: Project setup and architecture | Grep WASM project for `sk-`; confirm key only in backend `user-secrets` |
+| File I/O in WASM project | Phase 1: Project setup and architecture | Grep WASM project for `System.IO.File`; confirm persistence is backend-only |
+| Direct OpenAI call from WASM | Phase 1: Project setup and architecture | Confirm no OpenAI SDK registration in WASM `Program.cs` |
+| CORS misconfiguration | Phase 1: First HTTP call from WASM to backend | Verify browser console shows no CORS errors during local dev |
+| Scoped DI lifetime confusion | Phase covering conversation state management | Test: create two conversations; switch between them; verify history isolation |
+| Streaming transport not set | Phase introducing token-by-token streaming | Observe: tokens must appear incrementally; verify with slow prompt or network throttle |
+| StateHasChanged missing in stream loop | Phase introducing token-by-token streaming | Same as above — visible as UI not updating mid-stream |
+| UI freeze without streaming throttle | Phase polishing streaming UI | CPU profiler during active stream; verify no jank |
+| IL trimming breaks release | Phase 1 (publish test) + any phase adding new model types | Run `dotnet publish` as part of each phase completion check |
+| Markdown XSS via raw MarkupString | Phase introducing Markdown rendering | Code review: confirm no `new MarkupString(aiResponse)` without prior sanitization |
+| Auto-scroll fighting user scroll | Phase building chat message UI | Manual test: scroll up mid-stream; verify auto-scroll does not override |
+| No cancel button | Phase building streaming UI | Test: start stream, navigate away; check backend logs confirm stream terminated |
+
+---
+
+## Sources
+
+- [openai/openai-dotnet Issue #65 — Streaming doesn't work properly in Blazor WASM](https://github.com/openai/openai-dotnet/issues/65) — confirmed fix with `BlazorHttpClientTransport`
+- [Microsoft Q&A — Streaming Issue with Blazor WebAssembly and Semantic Kernel and OpenAI](https://learn.microsoft.com/en-sg/answers/questions/2242618/streaming-issue-with-blazor-webassembly-and-semati)
+- [DEV Community — Real Blazor WebAssembly Production Pitfalls](https://dev.to/janhjordie/real-blazor-webassembly-production-pitfalls-3hmf) — IL trimming, JS interop, release-only failures
+- [Chandradev Blog — 10 Blazor Coding Mistakes](https://chandradev819.wordpress.com/2025/12/17/10-blazor-coding-mistakes-i-see-in-real-projects-and-how-to-avoid-them/) — logic in components, DI misuse, naming
+- [Thinktecture — Dependency Injection Scopes in Blazor](https://www.thinktecture.com/en/blazor/dependency-injection-scopes-in-blazor/) — Scoped = Singleton in WASM
+- [ASP.NET Core Blazor DI docs](https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/dependency-injection) — official lifetime guidance
+- [ASP.NET Core Blazor rendering performance best practices](https://learn.microsoft.com/en-us/aspnet/core/blazor/performance/rendering) — StateHasChanged, re-render control
+- [dotnet/aspnetcore Issue #43098 — StateHasChanged not firing with IAsyncEnumerable](https://github.com/dotnet/aspnetcore/issues/43098)
+- [Microsoft — Secure ASP.NET Core Blazor WebAssembly](https://learn.microsoft.com/en-us/aspnet/core/blazor/security/webassembly/) — API key security, no secrets in WASM
+- [DEV Community — The Missing Third Config Layer: User Secrets in Blazor WASM](https://dev.to/j_sakamoto/the-missing-third-config-layer-adding-user-secrets-to-blazor-webassembly-2a5a) — confirms user secrets are NOT secret in WASM
+- [Microsoft — Blazor WebAssembly file access Q&A](https://learn.microsoft.com/en-us/answers/questions/1290337/blazor-webassembly-get-file-access) — browser sandbox, no local disk
+- [tpeczek.com — ASP.NET Core 9 and IAsyncEnumerable — Async Streaming from Blazor WASM](https://www.tpeczek.com/2024/09/aspnet-core-9-and-iasyncenumerable.html) — correct streaming patterns
+- [dotnet/aspnetcore Issue #55982 — network error with IAsyncEnumerable streaming in WASM](https://github.com/dotnet/aspnetcore/issues/55982)
+
+---
+*Pitfalls research for: Blazor WebAssembly AI Chat Application*
+*Researched: 2026-03-27*