Project Research Summary
Project: Blazor WebAssembly AI Chat Application
Domain: Single-user personal AI chat web app (.NET / C# / OpenAI GPT)
Researched: 2026-03-27
Confidence: HIGH
Executive Summary
This is a single-user personal AI chat application built on Blazor WebAssembly with an ASP.NET Core backend. The project has a dual purpose: functioning as a useful personal tool and serving as a tutorial-quality reference implementation for Blazor WASM patterns. The recommended architecture is a strict two-project split — a standalone Blazor WASM client and a separate ASP.NET Core Minimal API server — reflecting a breaking change in .NET 8+ that removed the hosted Blazor WASM template. The client runs entirely in the browser; the server holds secrets, calls OpenAI, and manages disk persistence. This boundary is non-negotiable and must be established before any feature code is written.
The core technical challenge is streaming. OpenAI's token-by-token streaming requires explicit opt-in in Blazor WASM (SetBrowserResponseStreamingEnabled(true)) that the SDK does not set by default — the stream silently falls back to buffered delivery with no error. Combined with the need to call StateHasChanged() on every token to update the UI, streaming is the highest-risk implementation step and must be validated early. All other features — conversation management, markdown rendering, copy-to-clipboard — are well-understood patterns with clear .NET implementations.
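As a rough illustration of the client-side opt-in described above, the sketch below shows how a WASM service might request a streamed response and read it incrementally. It assumes a Blazor WebAssembly project (needed for the `Microsoft.AspNetCore.Components.WebAssembly.Http` namespace), a hypothetical `api/chat/stream` endpoint that accepts the prompt as plain text, and SSE frames made of `data:` lines terminated by a blank line. The `SseReader` helper and all method names are illustrative, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Runtime.CompilerServices;
using System.Threading;
using Microsoft.AspNetCore.Components.WebAssembly.Http;

public sealed class ChatApiClient(HttpClient http)
{
    public async IAsyncEnumerable<string> StreamReplyAsync(
        string prompt, [EnumeratorCancellation] CancellationToken ct = default)
    {
        // Hypothetical route and plain-text body; adjust to the real endpoint contract.
        using var request = new HttpRequestMessage(HttpMethod.Post, "api/chat/stream")
        {
            Content = new StringContent(prompt)
        };

        // Without this opt-in the browser buffers the whole response body.
        request.SetBrowserResponseStreamingEnabled(true);

        // ResponseHeadersRead returns as soon as headers arrive instead of waiting for the body.
        using HttpResponseMessage response =
            await http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, ct);
        response.EnsureSuccessStatusCode();

        await using Stream body = await response.Content.ReadAsStreamAsync(ct);
        await foreach (string token in SseReader.ReadEventsAsync(body, ct))
        {
            yield return token;
        }
    }
}

public static class SseReader
{
    // Yields the data payload of each SSE event; multiple data lines are joined with '\n'.
    public static async IAsyncEnumerable<string> ReadEventsAsync(
        Stream stream, [EnumeratorCancellation] CancellationToken ct = default)
    {
        using var reader = new StreamReader(stream);
        var data = new List<string>();
        while (await reader.ReadLineAsync(ct) is { } line)
        {
            if (line.StartsWith("data:", StringComparison.Ordinal))
            {
                string value = line["data:".Length..];
                // A single leading space after the colon is part of the framing, not the payload.
                data.Add(value.StartsWith(' ') ? value[1..] : value);
            }
            else if (line.Length == 0 && data.Count > 0)
            {
                yield return string.Join('\n', data);
                data.Clear();
            }
        }
    }
}
```

Keeping the frame parsing in a separate static helper means it can be exercised with an in-memory stream, independently of the browser-only request flag.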
The key risk profile is concentrated in Phase 1 (architecture foundation) and the streaming phase. Three "never" mistakes — putting the API key in WASM, writing file I/O in the WASM project, and calling OpenAI directly from WASM — must be locked out architecturally before feature development begins. Once those boundaries are established, the remainder of the v1 feature set follows a clear dependency chain from storage to conversations to streaming to UI polish.
Key Findings
Recommended Stack
The stack is .NET 9 throughout: a blazorwasm standalone client and a webapi backend in a single solution, connected by HTTP and SSE. There are no exotic dependencies — the official OpenAI NuGet package (2.9.1, published by OpenAI directly) handles AI calls, Markdig (1.1.1) handles markdown-to-HTML conversion, MudBlazor (9.2.0) provides Material Design UI components with zero JavaScript dependencies, and System.Text.Json (built-in) handles JSON serialization and file storage. All versions are confirmed compatible with .NET 9.
The most important stack decision is what to exclude: do not use OpenAI-DotNet (unofficial community package), Microsoft.SemanticKernel (excessive abstraction for v1), Newtonsoft.Json (superseded by System.Text.Json), Blazored.LocalStorage (wrong persistence layer for this architecture), or JSInterop for streaming (WASM has a native streaming opt-in that avoids it).
Core technologies:
- .NET 9 SDK + C# 13: Runtime and language - stable, LTS-adjacent; both project types target net9.0
- Blazor WebAssembly (standalone): Client SPA - non-negotiable per project constraints; runs in-browser with no server round-trip for UI
- ASP.NET Core Minimal API: Backend proxy - required to keep the OpenAI key server-side and to handle disk I/O that WASM cannot perform
- OpenAI 2.9.1 (official): AI calls - CompleteChatStreamingAsync() with await foreach; the only correct .NET SDK choice
- Markdig 1.1.1: Markdown rendering - de facto .NET standard; CommonMark-compliant; renders via @((MarkupString)html) with no JS interop
- MudBlazor 9.2.0: UI components - pure C#, zero JS dependencies, comprehensive chat-friendly components, free license
- System.Text.Json (built-in): Persistence - serialize conversations to JSON files on the server; no extra dependency
Expected Features
The feature set is well-defined by comparison to ChatGPT and Claude as reference products. The v1 scope is intentionally constrained to what makes the app genuinely usable, with explicit anti-features documented to prevent scope creep during implementation.
Must have (table stakes):
- Send message / receive streaming response — core loop; blocking responses are unacceptable by 2026 standards
- Markdown rendering with syntax-highlighted code blocks — GPT always responds with markdown; raw output is unusable
- Multiple named conversations with create / switch / delete — without this, the app is a single disposable thread
- JSON file persistence across sessions — conversations must survive page refresh to be useful
- Auto-scroll to latest message and loading indicator — baseline polish that makes the app feel complete
- Copy-to-clipboard on code blocks — high-frequency action for the developer-focused target user
- Input disabled during streaming and send-on-Enter — prevents double-submit and matches chat conventions
- Tutorial-style inline code comments — the project's defining purpose as a learning resource
Should have (competitive, v1.x):
- Auto-generated conversation titles — reduces naming friction; single GPT summarization call
- System prompt / persona configuration — power-user feature; natural extension once multi-conversation works
- Model selector (GPT-4o vs GPT-4o-mini) — cost/quality tradeoff; low implementation cost
- Export conversation to markdown/text — low complexity, occasional high value
Defer (v2+):
- Message edit and regenerate — medium complexity; wait until core loop is solid
- Token usage display — streaming completion handling required; not blocking
- LangChain / agentic workflows, RAG, MCP server integration — explicitly v2 per project intent
- Voice input/output, image uploads, multi-user auth, PWA — all documented anti-features with clear rationale
Feature dependencies are explicit: JSON storage must precede conversation management, which must precede conversation switching. A basic blocking API call must precede streaming. Markdown must precede syntax highlighting, which must precede copy-to-clipboard.
Architecture Approach
The architecture is a strict two-tier system: a Blazor WASM SPA in the browser communicating with an ASP.NET Core Minimal API via HTTP and SSE. State on the client is managed by a singleton ConversationStateService that raises OnChange events — components subscribe in OnInitialized and unsubscribe in Dispose. There is a Shared library project that holds Conversation and ChatMessage models used by both tiers, eliminating duplicate DTOs.
Components are kept intentionally thin (data in via [Parameter], actions out via EventCallback<T>). All logic lives in services. This is explicitly stated as a tutorial goal — fat components teach bad habits and are hard to explain.
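The state-service pattern described above can be sketched in plain C#. This is a minimal illustration, not the project's actual implementation: the member names (`SetActive`, `AppendMessage`, `MessagesFor`) are assumptions, and the top-level statements stand in for a component that would subscribe in `OnInitialized`, call `StateHasChanged()` in the handler, and unsubscribe in `Dispose`.

```csharp
using System;
using System.Collections.Generic;

var state = new ConversationStateService();
int notifications = 0;
void HandleChange() => notifications++;   // a component would call StateHasChanged() here

state.OnChange += HandleChange;           // component: OnInitialized
Guid id = Guid.NewGuid();
state.SetActive(id);
state.AppendMessage(id, "user", "Hello");
state.OnChange -= HandleChange;           // component: Dispose
state.AppendMessage(id, "assistant", "Hi!"); // no longer observed

Console.WriteLine($"{notifications} notifications, {state.MessagesFor(id).Count} messages");

public sealed record ChatMessage(string Role, string Content);

public sealed class ConversationStateService
{
    // State is keyed by conversation ID so a long-lived instance never mixes threads.
    private readonly Dictionary<Guid, List<ChatMessage>> _messages = new();

    public event Action? OnChange;
    public Guid? ActiveConversationId { get; private set; }
    public bool IsStreaming { get; private set; }

    public IReadOnlyList<ChatMessage> MessagesFor(Guid id)
    {
        if (_messages.TryGetValue(id, out var list)) return list;
        return Array.Empty<ChatMessage>();
    }

    public void SetActive(Guid id)
    {
        ActiveConversationId = id;
        OnChange?.Invoke();
    }

    public void SetStreaming(bool value)
    {
        IsStreaming = value;
        OnChange?.Invoke();
    }

    public void AppendMessage(Guid id, string role, string content)
    {
        if (!_messages.TryGetValue(id, out var list))
        {
            _messages[id] = list = new List<ChatMessage>();
        }
        list.Add(new ChatMessage(role, content));
        OnChange?.Invoke();
    }
}
```

Forgetting the unsubscribe step keeps disposed components reachable from the service, which is why the pairing of `OnInitialized` and `Dispose` matters for a service that lives for the whole tab.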
Major components:
- ConversationStateService (WASM singleton): active conversation, message list, streaming flag; raises OnChange for all subscribed components
- ChatApiClient (WASM scoped service): wraps HttpClient, handles SSE stream reading with SetBrowserResponseStreamingEnabled(true) and ResponseHeadersRead
- OpenAiService (server scoped): wraps official OpenAI SDK, returns IAsyncEnumerable<string> of tokens to endpoint handlers
- ConversationRepository (server singleton): reads/writes JSON files under a configurable data directory; uses SemaphoreSlim(1,1) for write serialization
- ChatEndpoints + ConversationEndpoints (server Minimal API): thin HTTP layer wiring services to routes; SSE streaming endpoint proxies tokens to client
- Leaf UI components: MessageBubble, ChatInput, ConversationList, MessageList; pure display, no service calls
- Container component: ChatPage; composes all child components, owns the route (@page "/chat/{id?}")
Build order: Shared models → ConversationRepository → OpenAiService → server endpoints → ChatApiClient → ConversationStateService → leaf UI → container UI. This maps directly to implementation phases.
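For the OpenAiService entry in the component list above, a server-side wrapper that turns SDK streaming updates into an `IAsyncEnumerable<string>` might look roughly like this. It assumes the official `OpenAI` NuGet package; the tuple-based history shape, the role strings, and the method names are illustrative assumptions rather than the project's real model.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using OpenAI.Chat;

public sealed class OpenAiService(ChatClient client)
{
    public async IAsyncEnumerable<string> StreamReplyAsync(
        IReadOnlyList<(string Role, string Content)> history,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        List<ChatMessage> messages = ToSdkMessages(history);

        await foreach (StreamingChatCompletionUpdate update in
            client.CompleteChatStreamingAsync(messages, cancellationToken: ct))
        {
            // Each update may carry zero or more content parts.
            foreach (ChatMessageContentPart part in update.ContentUpdate)
            {
                if (!string.IsNullOrEmpty(part.Text))
                {
                    yield return part.Text;
                }
            }
        }
    }

    // Maps stored (role, content) pairs to SDK message types; unknown roles are treated as user.
    public static List<ChatMessage> ToSdkMessages(IEnumerable<(string Role, string Content)> history)
    {
        var result = new List<ChatMessage>();
        foreach (var (role, content) in history)
        {
            ChatMessage message = role switch
            {
                "system" => new SystemChatMessage(content),
                "assistant" => new AssistantChatMessage(content),
                _ => new UserChatMessage(content),
            };
            result.Add(message);
        }
        return result;
    }
}
```

Returning plain strings keeps endpoint handlers free of SDK types, so the HTTP layer stays thin and the SDK dependency stays inside one class.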
Critical Pitfalls
- Streaming silently broken in WASM (Pitfall 1 + 4): Two distinct failure modes that appear identical: (a) the OpenAI SDK does not set SetBrowserResponseStreamingEnabled(true), so the browser buffers the entire response; (b) StateHasChanged() is not called per token, so Blazor batches all renders until the stream completes. Both produce the same symptom: tokens appear all at once. Fix: a custom BlazorHttpClientTransport that sets the flag on browser-side requests, and an explicit StateHasChanged() + await Task.Yield() inside the await foreach token loop. Throttle to ~50ms intervals to prevent UI thread starvation at GPT-4o token speeds.
- API key exposure in WASM (Pitfall 2): wwwroot/appsettings.json is a static file served to any browser visitor. dotnet user-secrets in WASM projects are embedded in the published bundle in plaintext. The key must live exclusively in the server project, accessed via server-side user-secrets or environment variables. This boundary must be established in Phase 1 and never crossed.
- File I/O in WASM project (Pitfall 5): System.IO.File compiles in WASM but writes to an in-memory virtual filesystem that resets on every page refresh. All persistence must go through backend API endpoints. This reinforces the same architectural boundary as the API key rule.
- Scoped DI = Singleton in WASM (Pitfall 3): In Blazor WASM there is exactly one DI scope for the tab lifetime. A service registered as Scoped never resets. Design ConversationStateService to hold a collection keyed by conversation ID, not mutable "current conversation" fields.
- IL trimming breaks Release builds (Pitfall 6): Debug builds do not trim; published builds do. JSON serialization properties, DI-resolved types, and JSInterop callbacks can be silently stripped. Use [JsonSerializable] source generators on all model types and run dotnet publish once in Phase 1 to catch trim warnings while the surface is small.
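To make the trimming pitfall concrete, here is a small sketch of `System.Text.Json` source generation. The `Conversation` and `ChatMessage` records and the `AppJsonContext` name are assumptions for illustration; the point is that serialization goes through generated metadata instead of reflection, so the trimmer has nothing to strip silently.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

var conversation = new Conversation(
    Guid.NewGuid(), "First chat", new List<ChatMessage> { new("user", "Hello") });

// Passing the generated JsonTypeInfo avoids the reflection-based code path.
string json = JsonSerializer.Serialize(conversation, AppJsonContext.Default.Conversation);
Conversation? roundTripped = JsonSerializer.Deserialize(json, AppJsonContext.Default.Conversation);

Console.WriteLine(roundTripped?.Title);

public record ChatMessage(string Role, string Content);
public record Conversation(Guid Id, string Title, List<ChatMessage> Messages);

// One attribute per root type that crosses the wire or hits disk.
[JsonSerializable(typeof(Conversation))]
[JsonSerializable(typeof(List<Conversation>))]
public partial class AppJsonContext : JsonSerializerContext
{
}
```

A type missing from the context typically surfaces as a build-time or publish-time warning, which is easier to diagnose than a property that disappears only in Release.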
Implications for Roadmap
Based on combined research, the architecture dependency chain and pitfall prevention requirements suggest five phases:
Phase 1: Architecture Foundation
Rationale: Three critical "never" mistakes (API key in WASM, file I/O in WASM, direct OpenAI call from WASM) must be architecturally locked before any feature code is written. The WASM/backend split is the load-bearing constraint everything else depends on. This phase also establishes the Shared models library which both tiers need immediately.
Delivers: Working solution structure with two projects + shared library; CORS configured; basic HTTP connectivity verified WASM-to-server; dotnet publish tested once to catch IL trim warnings early; placeholder endpoints in place; no OpenAI calls yet.
Addresses: Project scaffolding, solution structure (FEATURES.md scaffolding prerequisite)
Avoids: API key exposure (Pitfall 2), file I/O in WASM (Pitfall 5), direct OpenAI calls from WASM (Architecture anti-pattern 1), IL trimming surprises (Pitfall 6)
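A Phase 1 server `Program.cs` along these lines would cover the CORS and placeholder-endpoint deliverables. This is a sketch under assumptions: it expects an ASP.NET Core web project, and the configuration keys (`Cors:ClientOrigin`, `OpenAI:ApiKey`), the default origin, and the `/api/health` route are all hypothetical names chosen for illustration.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Read the allowed origin from configuration so deployment does not require a code change.
string clientOrigin = builder.Configuration["Cors:ClientOrigin"] ?? "https://localhost:7001";

const string WasmClientPolicy = "WasmClient";
builder.Services.AddCors(options =>
    options.AddPolicy(WasmClientPolicy, policy =>
        policy.WithOrigins(clientOrigin)
              .AllowAnyHeader()
              .AllowAnyMethod()));

// The key is resolved server-side only (user-secrets in development, environment variables elsewhere).
string? apiKey = builder.Configuration["OpenAI:ApiKey"];
if (string.IsNullOrWhiteSpace(apiKey))
{
    Console.WriteLine("Warning: OpenAI:ApiKey is not configured yet.");
}

var app = builder.Build();
app.UseCors(WasmClientPolicy);

// Placeholder endpoint used to verify WASM-to-server connectivity.
app.MapGet("/api/health", () => Results.Ok(new { status = "ok" }));

app.Run();
```

Nothing here touches the WASM project, which is the point: the client only ever learns the server's base address.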
Phase 2: Conversation Storage and Management
Rationale: JSON file storage is the prerequisite for every conversation-related feature. Per the feature dependency graph: [JSON File Storage] → [Multiple Conversations] → [Create/Switch/Delete/Persist]. This phase must come before any AI integration because the persistence layer needs to exist before we can store AI responses.
Delivers: ConversationRepository with full CRUD, ConversationEndpoints wired to HTTP routes, ConversationList sidebar component, create/switch/delete conversations working, conversation history persisted to disk and loaded on startup. The app has no AI yet but has a working conversation management UI.
Uses: System.Text.Json built-in, SemaphoreSlim(1,1) for write serialization, MudBlazor for sidebar components
Implements: ConversationRepository, ConversationEndpoints, ConversationStateService (initial version), ConversationList.razor
Avoids: Scoped DI state leaks (Pitfall 3) — design ConversationStateService with Dictionary<Guid, ConversationState> from the start
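The repository described in this phase could be sketched as follows. The record shapes, one-file-per-conversation layout, and method names are assumptions for illustration; the part taken from the research is the `SemaphoreSlim(1,1)` guarding writes so concurrent requests cannot interleave file operations.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public record ChatMessage(string Role, string Content);
public record Conversation(Guid Id, string Title, List<ChatMessage> Messages);

public sealed class ConversationRepository
{
    private readonly string _dataDirectory;
    private readonly SemaphoreSlim _writeLock = new(1, 1);

    public ConversationRepository(string dataDirectory)
    {
        _dataDirectory = dataDirectory;
        Directory.CreateDirectory(_dataDirectory);
    }

    private string PathFor(Guid id) => Path.Combine(_dataDirectory, $"{id}.json");

    public async Task SaveAsync(Conversation conversation, CancellationToken ct = default)
    {
        await _writeLock.WaitAsync(ct);
        try
        {
            string json = JsonSerializer.Serialize(conversation);
            await File.WriteAllTextAsync(PathFor(conversation.Id), json, ct);
        }
        finally
        {
            _writeLock.Release();
        }
    }

    public async Task<Conversation?> LoadAsync(Guid id, CancellationToken ct = default)
    {
        string path = PathFor(id);
        if (!File.Exists(path)) return null;

        string json = await File.ReadAllTextAsync(path, ct);
        return JsonSerializer.Deserialize<Conversation>(json);
    }

    public async Task<bool> DeleteAsync(Guid id, CancellationToken ct = default)
    {
        await _writeLock.WaitAsync(ct);
        try
        {
            string path = PathFor(id);
            if (!File.Exists(path)) return false;
            File.Delete(path);
            return true;
        }
        finally
        {
            _writeLock.Release();
        }
    }
}
```

Registering the class as a singleton matters here: the semaphore only serializes writes if every request shares the same instance.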
Phase 3: Basic AI Chat (Non-Streaming)
Rationale: Per the feature dependency chain, a working blocking API call must be established before streaming is layered on top. Building non-streaming first validates the full request/response shape, CORS, error handling, and conversation history construction without the added complexity of SSE. This is the correct learning sequence for a tutorial project.
Delivers: Full chat loop working end-to-end: user sends message → backend calls OpenAI → response appended to conversation → conversation saved to disk. All without streaming. Markdown rendering added here because GPT responses with raw markdown are effectively unusable and would make all testing painful.
Uses: OpenAI 2.9.1 SDK, Markdig 1.1.1, MudBlazor chat components
Implements: OpenAiService, ChatEndpoints (non-streaming POST), ChatApiClient (basic POST), MessageBubble.razor with @((MarkupString)html) rendering, ChatInput.razor
Avoids: Markdown XSS via raw MarkupString (PITFALLS integration gotchas) — sanitize or accept risk explicitly in code comments
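One low-friction way to act on the XSS note is to render through a Markdig pipeline with raw HTML disabled. The sketch below assumes the Markdig package; `MarkdownRenderer` is a hypothetical helper name. It reduces the risk from raw HTML in model output but should not be read as a complete sanitizer.

```csharp
using Markdig;

public static class MarkdownRenderer
{
    // Built once and reused; raw HTML in the input is escaped instead of passed through.
    private static readonly MarkdownPipeline Pipeline = new MarkdownPipelineBuilder()
        .DisableHtml()
        .Build();

    public static string ToHtml(string markdown) => Markdown.ToHtml(markdown, Pipeline);
}
```

In a component such as MessageBubble, the result would then be rendered with `@((MarkupString)MarkdownRenderer.ToHtml(message.Content))`, with a code comment recording that the remaining risk is accepted for a single-user tool.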
Phase 4: Streaming Responses
Rationale: Streaming is the highest-risk implementation step. Research identified two independent failure modes (transport not set, StateHasChanged not called) that produce identical symptoms. Addressing this in its own phase means streaming can be diagnosed and debugged in isolation, without other variables. All streaming-specific patterns — BlazorHttpClientTransport, SSE endpoint, ResponseHeadersRead, per-token StateHasChanged with throttling — are introduced and documented here.
Delivers: Token-by-token streaming from OpenAI through the backend SSE endpoint to the WASM UI. Loading indicator shown immediately on send, hidden on first token. Auto-scroll to latest message. Input disabled during streaming. Cancel button wired to CancellationToken. Stream throttling (~50ms) to prevent UI thread starvation.
Uses: SetBrowserResponseStreamingEnabled(true), HttpCompletionOption.ResponseHeadersRead, text/event-stream SSE frames, await Task.Yield() in token loop
Implements: Streaming ChatEndpoints, updated ChatApiClient with stream reader, updated MessageList and ChatPage with streaming state
Avoids: Streaming silently broken (Pitfall 1), UI freeze without StateHasChanged (Pitfall 4), UI thread starvation from unthrottled renders (PITFALLS performance traps)
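On the server side, the SSE endpoint for this phase might be shaped like the sketch below. It assumes an ASP.NET Core web project; the `/api/chat/stream` route is hypothetical, and `StubTokensAsync` is a stand-in for the real token source so the example runs without an API key. Each token is flushed immediately so nothing is held back in server buffers.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.MapPost("/api/chat/stream", async (HttpContext ctx) =>
{
    ctx.Response.ContentType = "text/event-stream";
    ctx.Response.Headers.CacheControl = "no-cache";

    // RequestAborted fires when the client cancels, which stops the upstream enumeration too.
    await foreach (string token in StubTokensAsync(ctx.RequestAborted))
    {
        await ctx.Response.WriteAsync(SseFrame(token), ctx.RequestAborted);
        await ctx.Response.Body.FlushAsync(ctx.RequestAborted);
    }
});

app.Run();

// One SSE event per token; embedded newlines become additional data lines within the same event.
static string SseFrame(string token) => $"data: {token.Replace("\n", "\ndata: ")}\n\n";

// Stand-in for the real OpenAI token stream (hypothetical data).
static async IAsyncEnumerable<string> StubTokensAsync(
    [EnumeratorCancellation] CancellationToken ct)
{
    foreach (string token in new[] { "Hello", ",", " world" })
    {
        await Task.Delay(100, ct);
        yield return token;
    }
}
```

Testing the endpoint with a stub first separates transport problems (buffering, framing) from OpenAI-related ones, which fits the goal of diagnosing streaming in isolation.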
Phase 5: Polish and v1.x Features
Rationale: Once the core loop (storage + AI + streaming) is solid, the remaining v1.x features are all low-to-medium complexity additions that build on the established foundation. Grouping them together allows the tutorial narrative to focus on "extending a working app" rather than "getting the basics right."
Delivers: Auto-generated conversation titles (GPT summarization call after first exchange), syntax-highlighted code blocks (Markdown.ColorCode Markdig pipeline extension), copy-to-clipboard on code blocks (JS interop via navigator.clipboard.writeText), responsive layout for mobile, error handling with user-visible messages, model selector dropdown (GPT-4o vs GPT-4o-mini). Optional v1.x additions: system prompt configuration, export conversation.
Uses: Markdown.ColorCode NuGet package (base package, NOT CSharpToColoredHtml which breaks WASM), navigator.clipboard JS interop
Implements: Updated MarkdownPipeline with ColorCode extension, ClipboardService.cs JS interop wrapper, settings model for model selection
Phase Ordering Rationale
- Architecture before features prevents the three hardest-to-recover-from mistakes (API key exposure, WASM file I/O, wrong project boundaries) from being baked in.
- Storage before AI follows the feature dependency graph exactly: conversations need a home before AI responses can be stored in them.
- Non-streaming before streaming validates the full request/response shape with simpler code, making streaming easier to debug when it is introduced.
- Streaming as its own phase isolates the highest-risk technical challenge. Combined with the tutorial purpose, this also makes for a clear "here is how streaming actually works in Blazor WASM" chapter.
- Polish last respects the single-responsibility of each phase and avoids complexity interleaving.
Research Flags
Phases likely needing deeper research during planning (i.e., run /gsd:research-phase):
- Phase 4 (Streaming): The BlazorHttpClientTransport workaround and SSE frame format have multiple interacting constraints. Phase planning should re-verify the current state of openai-dotnet issue #65 and confirm whether .NET 9.x patch releases have changed the default behavior. Token throttling strategy (timer vs counter) also warrants a concrete recommendation.
- Phase 5 (Markdown.ColorCode + JS Interop): The WASM compatibility note (base Markdown.ColorCode works; CSharpToColoredHtml does not) was sourced from community reports. Verify against the current NuGet package version before implementing.
Phases with standard patterns (skip research-phase):
- Phase 1 (Architecture Foundation): The two-project solution structure and CORS setup are fully documented in official Microsoft docs. No novel patterns.
- Phase 2 (Conversation Storage): Repository pattern with JSON file I/O is a standard .NET pattern. SemaphoreSlim for single-writer serialization is well-documented.
- Phase 3 (Basic AI Chat): OpenAI SDK usage for non-streaming chat completions is documented in the official SDK repo with examples. Markdig integration in Blazor has multiple tutorial references.
Confidence Assessment
| Area | Confidence | Notes |
|---|---|---|
| Stack | HIGH | All package versions verified on nuget.org; official SDK confirmed by OpenAI .NET Blog post; version compatibility table verified against published TFM support |
| Features | HIGH | Feature set cross-referenced against live ChatGPT and Claude UX; OpenAI streaming API docs consulted; Blazor-specific constraints verified |
| Architecture | HIGH | Microsoft official Blazor docs + verified community implementations (PalmHill.BlazorChat reference); all patterns confirmed with working code samples |
| Pitfalls | HIGH | Critical pitfalls sourced from official GitHub issue tracker (openai-dotnet #65, aspnetcore #43098), Microsoft Q&A, and documented production experience |
Overall confidence: HIGH
Gaps to Address
- Streaming transport behavior in .NET 9 patch releases: The SetBrowserResponseStreamingEnabled(true) workaround is confirmed required in .NET 9 and becomes default in .NET 10. There is a possibility a .NET 9.x patch release may have changed this behavior. Verify at the start of Phase 4 by checking the official .NET 9 breaking change notes.
- StateHasChanged throttling threshold: Research recommends ~50ms or every N tokens, but the optimal value depends on GPT-4o's actual token delivery rate and the target device's rendering performance. Treat as a tunable constant in code rather than a magic number.
- XSS risk of rendering GPT output as MarkupString: This is a known accepted risk for a single-user personal tool. Document the decision explicitly in the code (tutorial purpose) rather than leaving it as a silent assumption. Consider adding Markdig's DisableHtml() pipeline option as a low-friction mitigation.
- CORS configuration for deployment: Research covered localhost development CORS. If the app is ever deployed (even to a home server), the CORS origin list needs updating. Document this as a deployment note in Phase 1.
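For the throttling gap listed above, a time-based throttle with a named constant could look like this. It is a sketch, not a measured recommendation: `RenderThrottle`, `StreamingOptions`, and the fake token source are hypothetical, and the counter increments mark where a component would call `StateHasChanged()`. The clock is injected so the threshold can be tested without real delays.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;
using System.Threading.Tasks;

var stopwatch = Stopwatch.StartNew();
var throttle = new RenderThrottle(StreamingOptions.MinRenderInterval, () => stopwatch.Elapsed);
var text = new StringBuilder();
int renders = 0;

await foreach (string token in FakeTokensAsync())
{
    text.Append(token);
    if (throttle.ShouldRender())
    {
        renders++;           // in a component: StateHasChanged();
        await Task.Yield();  // let the renderer run before the next token
    }
}
renders++;                   // final render so trailing tokens are always shown
Console.WriteLine($"{text.Length} chars rendered in {renders} passes");

// Stand-in for the real token stream (hypothetical data).
static async IAsyncEnumerable<string> FakeTokensAsync()
{
    for (int i = 0; i < 40; i++)
    {
        await Task.Delay(5);
        yield return "tok ";
    }
}

public static class StreamingOptions
{
    // Tunable starting point taken from the ~50ms guidance, not a measured optimum.
    public static readonly TimeSpan MinRenderInterval = TimeSpan.FromMilliseconds(50);
}

public sealed class RenderThrottle(TimeSpan minInterval, Func<TimeSpan> elapsed)
{
    private TimeSpan? _lastRender;

    // Returns true at most once per interval; the first call always renders.
    public bool ShouldRender()
    {
        TimeSpan now = elapsed();
        if (_lastRender is { } last && now - last < minInterval) return false;
        _lastRender = now;
        return true;
    }
}
```

The unconditional render after the loop is the part that is easy to forget: without it, tokens that arrive inside the last interval would stay invisible until some other state change.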
Sources
Primary (HIGH confidence)
- https://www.nuget.org/packages/OpenAI — OpenAI 2.9.1 version and publish date confirmed
- https://github.com/openai/openai-dotnet - Streaming API (CompleteChatStreamingAsync, await foreach) verified
- https://www.nuget.org/packages/Markdig - Markdig 1.1.1 confirmed; .NET 8 TFM confirmed .NET 9 compatible
- https://www.nuget.org/packages/MudBlazor — MudBlazor 9.2.0 confirmed; .NET 8/9/10 support listed
- https://learn.microsoft.com/en-us/aspnet/core/blazor/hosting-models?view=aspnetcore-9.0 — Standalone WASM vs Blazor Web App distinction; hosted template removal confirmed
- https://learn.microsoft.com/en-us/dotnet/core/compatibility/networking/10.0/default-http-streaming — WASM streaming opt-in (.NET 9) vs default (.NET 10) breaking change
- https://learn.microsoft.com/en-us/aspnet/core/blazor/call-web-api?view=aspnetcore-10.0 — HttpClient streaming patterns for Blazor
- https://learn.microsoft.com/en-us/aspnet/core/blazor/fundamentals/dependency-injection — Official DI lifetime guidance; Scoped = Singleton in WASM
- https://learn.microsoft.com/en-us/aspnet/core/blazor/security/webassembly/ — API key security; no secrets in WASM bundle
- https://learn.microsoft.com/en-us/aspnet/core/blazor/performance/rendering — StateHasChanged and re-render control
- https://devblogs.microsoft.com/dotnet/openai-dotnet-library/ — Official .NET Blog announcement of the OpenAI SDK
Secondary (MEDIUM confidence)
- https://github.com/openai/openai-dotnet/issues/65 - Confirmed SetBrowserResponseStreamingEnabled(true) fix with BlazorHttpClientTransport; community-verified workaround
- https://www.meziantou.net/streaming-an-http-response-in-blazor-webassembly.htm - WASM streaming HttpClient patterns
- https://www.thinktecture.com/en/blazor/dependency-injection-scopes-in-blazor/ — Scoped = Singleton in WASM; verified against official docs
- https://www.strathweb.com/2024/07/built-in-support-for-server-sent-events-in-net-9/ — SSE native support in .NET 9
- https://github.com/edgett/PalmHill.BlazorChat — Reference implementation; WASM + WebAPI + real-time LLM
- https://dev.to/janhjordie/real-blazor-webassembly-production-pitfalls-3hmf — IL trimming, JS interop, release-only failures
- https://github.com/dotnet/aspnetcore/issues/43098 — StateHasChanged not firing with IAsyncEnumerable
Tertiary (MEDIUM-LOW confidence, validate before use)
- https://dev.to/kazinix/blazor-web-app-webassembly-hosted-in-net8-and-net9-1k6g — Hosted template removal in .NET 8+ (single community source; cross-checked against official docs)
- https://chandradev819.wordpress.com/2025/12/17/10-blazor-coding-mistakes-i-see-in-real-projects-and-how-to-avoid-them/ — Fat component patterns, DI misuse
- https://www.nuget.org/packages/Markdown.ColorCode — WASM base package compatibility note (community-reported; verify during Phase 5 implementation)
Research completed: 2026-03-27
Ready for roadmap: yes