fix: update export bundle for Azure OpenAI and add streaming diagnostics

Replace CLIProxyAPI/local proxy references with Azure OpenAI using
DefaultAzureCredential and tenant ID auth. Add Critical Pattern #8
for SSE buffering diagnostics with timestamped curl test. Add
streaming verification tasks (T6b, T15) and troubleshooting entries
for Azure AD auth, RBAC, response compression, and proxy buffering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
local
2026-04-07 01:42:38 +01:00
parent d46b179221
commit 956ec243c5
3 changed files with 256 additions and 59 deletions


@@ -62,7 +62,7 @@ This feature automates extraction with AI + tool-calling validation, reducing er
## Scope
- New page at /nlxva-pricer, new MudNavLink in existing NavMenu
- New controller with 2 endpoints (chat + extract), same SSE streaming contract
- Semantic Kernel integration with OpenAI-compatible proxy
- Semantic Kernel integration with Azure OpenAI (Azure AD auth via tenant ID)
- Few-shot prompting infrastructure (instruction template + 3 examples)
- External API clients for counterparty/trade/currency validation
- Client-side markdown rendering with XSS sanitization
@@ -89,6 +89,14 @@ This feature automates extraction with AI + tool-calling validation, reducing er
With raw HttpClient we'd need to manually parse tool-call JSON, dispatch functions,
and feed results back. SK handles this loop automatically via FunctionChoiceBehavior.Auto().
## Architecture Decision: Azure OpenAI with DefaultAzureCredential
**Why:** The sandbox environment uses Azure OpenAI with an Azure AD tenant ID.
SK's `AddAzureOpenAIChatCompletion()` with `DefaultAzureCredential` integrates
with CRC's existing Azure AD auth. No API keys to manage — uses the developer's
`az login` token locally and managed identity in production. The endpoint URL
does NOT need `/v1` (Azure SDK constructs the path internally).
## Architecture Decision: SSE streaming over WebSocket
**Why:** SSE is simpler (unidirectional server→client), works through HTTP proxies,
@@ -133,10 +141,25 @@ in the Razor component. CRC's existing Fluxor infrastructure is untouched.
CRC.Server may need its CORS policy updated to allow SSE streaming (Content-Type: text/event-stream)
to the CRC.Client origin. Verify existing policy covers this.
## Risk: SSE response buffering
Two streaming hops: Azure OpenAI → CRC.Server → Browser. Buffering at any point kills
streaming UX. Common culprits: response compression middleware (`UseResponseCompression()`),
reverse proxies (NGINX, IIS), Azure API Management in front of Azure OpenAI.
Use the diagnostic stream-test endpoint (see Critical Pattern #8 in reference spec)
to verify both hops stream correctly before building the UI.
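On the CRC.Server side of hop 2, each SSE event should be flushed as soon as it is written. The following is a minimal sketch of that idea, not the project's actual `WriteSSEAsync` helper: the class and method names (`SseWriter`, `WriteEventAsync`) are hypothetical, and it only assumes the `data: ...` followed by a blank line framing described in the reference spec.

```csharp
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of CRC or Semantic Kernel).
public static class SseWriter
{
    // Writes one SSE event ("data: <payload>\n\n") and flushes immediately,
    // so the event is handed to the next hop instead of sitting in a buffer.
    public static async Task WriteEventAsync(
        Stream output, string payload, CancellationToken ct = default)
    {
        var sb = new StringBuilder();

        // SSE is line-oriented: a payload containing newlines must be
        // split across several "data:" lines within the same event.
        foreach (var line in payload.Split('\n'))
        {
            sb.Append("data: ").Append(line.TrimEnd('\r')).Append('\n');
        }
        sb.Append('\n'); // a blank line terminates the event

        var bytes = Encoding.UTF8.GetBytes(sb.ToString());
        await output.WriteAsync(bytes, 0, bytes.Length, ct);
        await output.FlushAsync(ct);
    }
}
```

In a controller this would be called as `await SseWriter.WriteEventAsync(Response.Body, token)`. Flushing only removes buffering inside the application; response compression middleware or a reverse proxy can still hold events back, which is what the diagnostic endpoint is meant to detect.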
## Risk: Semantic Kernel version compatibility
CRC targets .NET 8.0. Ensure the SK NuGet package version is compatible with .NET 8.
Current stable SK packages support .NET 8+.
Current stable SK packages support .NET 8+. The `Microsoft.SemanticKernel.Connectors.AzureOpenAI`
and `Azure.Identity` packages are also required.
## Risk: Azure AD token acquisition
`DefaultAzureCredential` tries multiple auth methods in sequence. On a developer machine,
it uses Azure CLI login (`az login --tenant <tenant-id>`). If the developer hasn't run
`az login`, SK will fail with an auth error at the first LLM call, not at startup.
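One way to surface this failure earlier is to request a token once during startup. This is a sketch under assumptions rather than part of the port: the class name `AzureAdStartupCheck` is hypothetical, it requires the `Azure.Identity` package, and it assumes the Cognitive Services scope `https://cognitiveservices.azure.com/.default` is the one the Azure OpenAI resource accepts.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Azure.Core;
using Azure.Identity;

// Hypothetical startup probe (not part of CRC or Semantic Kernel).
public static class AzureAdStartupCheck
{
    // Assumed scope for Azure OpenAI; confirm against your resource.
    private const string Scope = "https://cognitiveservices.azure.com/.default";

    // Returns true if DefaultAzureCredential can obtain a token right now.
    public static async Task<bool> CanAcquireTokenAsync(
        string tenantId, CancellationToken ct = default)
    {
        var credential = new DefaultAzureCredential(
            new DefaultAzureCredentialOptions { TenantId = tenantId });
        try
        {
            await credential.GetTokenAsync(
                new TokenRequestContext(new[] { Scope }), ct);
            return true;
        }
        catch (AuthenticationFailedException)
        {
            // CredentialUnavailableException derives from this type, so the
            // "no az login" case lands here as well.
            return false;
        }
    }
}
```

Calling this once after reading `NlxvaPricer:TenantId` and logging a warning that mentions `az login --tenant <tenant-id>` when it returns false would give the developer a clearer message than the credential chain error at the first LLM call.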
## Risk: Large file uploads
@@ -153,7 +176,7 @@ Email HTML files are read entirely into memory (max 10MB guard). For typical sal
## Phase 1: Foundation (Server)
- [ ] **T1: Add NuGet packages** — Add `Microsoft.SemanticKernel` to CRC.Server. Add `Markdig` 1.1.1 to CRC.Client (if not already present). Verify .NET 8 compatibility.
- [ ] **T1: Add NuGet packages** — Add `Microsoft.SemanticKernel`, `Microsoft.SemanticKernel.Connectors.AzureOpenAI`, and `Azure.Identity` to CRC.Server. Add `Markdig` 1.1.1 to CRC.Client (if not already present). CRC may already have `Azure.Identity` — check first. Verify .NET 8 compatibility.
- [ ] **T2: Add shared DTOs** — Create in CRC.Shared: `NlxvaChatMessage`, `NlxvaChatRequest`, `NlxvaModelSettings`, `NlxvaExtractionRequest`, `NlxvaExtractionResult`, `TradeItem` (with `[JsonPropertyName]` snake_case), `NlxvaValidationResult`, `NlxvaCandidateMatch`. See Contracts section in reference spec for exact shapes.
@@ -163,7 +186,9 @@ Email HTML files are read entirely into memory (max 10MB guard). For typical sal
- [ ] **T5: Add FewShotService** — Create in CRC.Server/Services: `FewShotService` that loads instruction template + few-shot examples from disk. Caches ChatHistory prefix. Methods: `CloneWithEmail()`, `CloneWithEmailAndMessages()`. Register as Singleton. Copy examples/ folder to CRC.Server root.
- [ ] **T6: Register Semantic Kernel** — In CRC.Server DI: `AddOpenAIChatCompletion()` + `AddKernel()`. Base URL MUST include `/v1`. Config from `NlxvaPricer:*` keys in appsettings.json. See Critical Pattern #2.
- [ ] **T6: Register Semantic Kernel** — In CRC.Server DI: `AddAzureOpenAIChatCompletion()` with `DefaultAzureCredential` (tenant ID from config) + `AddKernel()`. Endpoint is Azure OpenAI resource URL (NO `/v1`). Use deployment name, NOT model name. Config from `NlxvaPricer:*` keys in appsettings.json. See Critical Pattern #2.
- [ ] **T6b: Verify streaming hop 1** — Add temporary `stream-test` diagnostic endpoint (see Critical Pattern #8). Run `curl -N` against it. Verify timestamps are spread across seconds (not clustered). Check for response compression middleware interference. Remove diagnostic endpoint after verification.
- [ ] **T7: Add NlxvaPricerController** — Create controller with `POST /api/nlxva-pricer/chat` and `POST /api/nlxva-pricer/extract`. Both stream SSE. Chat endpoint: builds ChatHistory from messages + optional system prompt + model settings. Extract endpoint: uses FewShotService prefix. Both import ExtractionPlugin per-request and enable `FunctionChoiceBehavior.Auto()`. See Critical Pattern #6.
@@ -183,9 +208,11 @@ Email HTML files are read entirely into memory (max 10MB guard). For typical sal
## Phase 3: Verify
- [ ] **T14: Config** — Add `NlxvaPricer` and `ExternalApis` sections to CRC.Server appsettings.json. Ensure CORS allows CRC.Client origin for SSE responses.
- [ ] **T14: Config** — Add `NlxvaPricer` (AzureOpenAIEndpoint, DeploymentName, TenantId, FewShotPath) and `ExternalApis` sections to CRC.Server appsettings.json. Ensure CORS allows CRC.Client origin for SSE responses. Ensure developer has run `az login --tenant <tenant-id>`.
- [ ] **T15: Smoke test** — Build both projects. Navigate to /nlxva-pricer. Send a chat message → verify streaming. Upload an example email HTML → verify extraction streams. Verify New Chat resets. Verify drag-drop visual feedback.
- [ ] **T15: Verify streaming end-to-end** — Run `curl -N` against `/api/nlxva-pricer/chat` to verify hop 2 (server → client) streams correctly. Check browser Network tab EventStream view for incremental token delivery. If response compression is enabled, verify SSE endpoints opt out.
- [ ] **T16: Smoke test** — Build both projects. Navigate to /nlxva-pricer. Send a chat message → verify streaming tokens appear incrementally. Upload an example email HTML → verify extraction streams. Verify New Chat resets. Verify drag-drop visual feedback.
## Implementation Notes


@@ -24,9 +24,9 @@ You have three companion documents for this port:
The Natural Language XVA Pricer is a chat-based interface that lets the CVA desk interact with an AI agent to price trades using natural language. It serves two modes: **general chat** (ask questions about XVA pricing, get explanations) and **email extraction** (upload a sales email, get structured trade data back as JSON).
The data flows like this: The user types a message or drops an email `.html` file onto the chat area. The Blazor WASM client sends the request to the ASP.NET Core backend via HTTP POST. The backend processes it through **Microsoft Semantic Kernel** — an AI orchestration framework that connects to an OpenAI-compatible LLM proxy (CLIProxyAPI running locally). For extraction requests, the backend prepends **few-shot examples** (real email → expected JSON pairs loaded from disk) to teach the model the expected output format. The LLM can autonomously call **validation tools** (counterparty lookup, trade ID validation, currency validation, schema validation) via SK's automatic function calling. The response streams back token-by-token as **Server-Sent Events (SSE)**, and the client renders each token into the chat UI with **markdown formatting** and **XSS sanitization**.
The data flows like this: The user types a message or drops an email `.html` file onto the chat area. The Blazor WASM client sends the request to the ASP.NET Core backend via HTTP POST. The backend processes it through **Microsoft Semantic Kernel** — an AI orchestration framework that connects to **Azure OpenAI** using Azure AD authentication (the same tenant CRC already uses). For extraction requests, the backend prepends **few-shot examples** (real email → expected JSON pairs loaded from disk) to teach the model the expected output format. The LLM can autonomously call **validation tools** (counterparty lookup, trade ID validation, currency validation, schema validation) via SK's automatic function calling. The response streams back token-by-token as **Server-Sent Events (SSE)**, and the client renders each token into the chat UI with **markdown formatting** and **XSS sanitization**.
The external dependencies are: (1) a CLIProxyAPI proxy for LLM access (any OpenAI-compatible endpoint works), (2) three external APIs for validation (counterparty, trade, currency) — these are the existing CRC backend services that CRC.Server already integrates with, and (3) the `Markdig` NuGet package for markdown rendering plus `Microsoft.SemanticKernel` for LLM orchestration.
The external dependencies are: (1) an Azure OpenAI resource with a deployed model (authenticated via Azure AD tenant ID), (2) three external APIs for validation (counterparty, trade, currency) — these are the existing CRC backend services that CRC.Server already integrates with, and (3) the `Markdig` NuGet package for markdown rendering plus `Microsoft.SemanticKernel` for LLM orchestration.
**The one thing you must understand**: this feature is an isolated page. It doesn't need Fluxor, doesn't modify CRC's data layer, and doesn't touch the Pricer/MarketData/XVA/Sales pages. It adds a controller, some services, a page, and a nav link. If something goes wrong during porting, the blast radius is limited to the new files.
@@ -34,20 +34,25 @@ The external dependencies are: (1) a CLIProxyAPI proxy for LLM access (any OpenA
## Design Decisions (Detailed)
### 1. Semantic Kernel over raw HttpClient for LLM communication
### 1. Semantic Kernel with Azure OpenAI for LLM communication
**What we chose:** Microsoft Semantic Kernel (SK) as the AI orchestration layer.
**What we chose:** Microsoft Semantic Kernel (SK) as the AI orchestration layer, connecting to Azure OpenAI via Azure AD authentication.
**Why:** The core value isn't just chat — it's the **extraction agent loop**. The agent extracts trade data, calls validation tools, interprets results, retries with fixes, and escalates to the user. Without SK, you'd need to: (a) manually parse the LLM's tool-call JSON from the streaming response, (b) dispatch to the correct C# function, (c) serialize the result, (d) feed it back to the LLM, (e) handle the loop termination. SK does all of this with one line: `FunctionChoiceBehavior.Auto()`. It turns ~200 lines of manual orchestration into zero.
Azure OpenAI is the LLM backend because CRC's sandbox environment provides it with an Azure AD tenant. `DefaultAzureCredential` integrates with CRC's existing Azure AD auth — no separate API keys to manage. On a developer's machine it uses the `az login` token; in production it can use managed identity.
**What we rejected:**
- **Raw HttpClient + manual SSE parsing** — This was the original Phase 2 approach. It works for simple chat but doesn't support tool calling without writing a full agent loop. Rejected when we added extraction tools.
- **LangChain/.NET equivalent** — Considered briefly. SK is Microsoft's official offering, has first-class .NET support, and integrates cleanly with ASP.NET Core DI. LangChain's .NET port was less mature.
- **Azure OpenAI Service directly** — CRC's network may not allow direct Azure OpenAI access from the server. CLIProxyAPI acts as a local proxy, and SK's OpenAI connector targets any OpenAI-compatible endpoint.
- **OpenAI direct (non-Azure)** — CRC's network may not allow direct OpenAI access. Azure OpenAI is within the corporate Azure tenant, which is already permitted.
- **API key auth** — Simpler to configure but keys need rotation and secure storage. Azure AD tokens are automatic and tied to the developer/service identity.
**When you'd revisit this:** If CRC moves to Azure OpenAI with managed identity auth, you'd swap `AddOpenAIChatCompletion()` for `AddAzureOpenAIChatCompletion()`. SK makes this a one-line change.
**When you'd revisit this:** If the Azure OpenAI resource is decommissioned or you need a different model provider, swap `AddAzureOpenAIChatCompletion()` for `AddOpenAIChatCompletion()`. SK abstracts the difference, so everything downstream (controller, plugins, streaming) stays identical.
**Target adaptation:** CRC uses Scrutor for assembly scanning. SK's `AddKernel()` and `AddOpenAIChatCompletion()` are explicit registrations that coexist with Scrutor — no conflict. But verify that Scrutor doesn't auto-register ExtractionPlugin before your manual `AddScoped<ExtractionPlugin>()` call (it could if it scans the Plugins namespace). If it does, you'll get the plugin registered without its HttpClient dependencies. Check by looking at CRC's Scrutor scan filters.
**Target adaptation:** CRC uses Scrutor for assembly scanning. SK's `AddKernel()` and `AddAzureOpenAIChatCompletion()` are explicit registrations that coexist with Scrutor — no conflict. But verify that Scrutor doesn't auto-register ExtractionPlugin before your manual `AddScoped<ExtractionPlugin>()` call (it could if it scans the Plugins namespace). If it does, you'll get the plugin registered without its HttpClient dependencies. Check by looking at CRC's Scrutor scan filters.
**Azure AD prerequisite:** Developers must run `az login --tenant <tenant-id>` before starting CRC.Server. If no token is available, `DefaultAzureCredential` does not fail at startup; it fails at the first LLM call. The error message mentions "ManagedIdentityCredential" and "EnvironmentCredential" failures, which can be confusing because neither is the real problem on a developer machine. The fix is always `az login`.
---
@@ -200,11 +205,14 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
**Step-by-step:**
1. Add `Microsoft.SemanticKernel` to `CRC.Server.csproj`
2. Add `Markdig` to `CRC.Client.csproj` (check if it's already there: `grep -i markdig CRC.Client.csproj`)
3. Run `dotnet restore CRC.sln`
2. Add `Microsoft.SemanticKernel.Connectors.AzureOpenAI` to `CRC.Server.csproj`
3. Add `Azure.Identity` to `CRC.Server.csproj` (check first: `grep -i Azure.Identity CRC.Server.csproj` — CRC may already have it since it uses Azure AD)
4. Add `Markdig` to `CRC.Client.csproj` (check if it's already there: `grep -i markdig CRC.Client.csproj`)
5. Run `dotnet restore CRC.sln`
**Expected friction on target:**
- **GV Artifactory may not have `Microsoft.SemanticKernel`**. SK is a relatively new package. If it's not mirrored in the internal feed, you'll need to either: request it be added to Artifactory, or temporarily add nuget.org as a source in `nuget.config` (check with your team if this is allowed).
- **Azure.Identity version conflict**. If CRC already has `Azure.Identity` at a different version, the SK transitive dependency may conflict. Run `dotnet list CRC.Server package --include-transitive | grep Azure.Identity` to check.
- **Version pinning**. CRC uses `RestorePackagesWithLockFile=true` — after installing, commit the updated `packages.lock.json`.
**Verify it works:**
@@ -348,37 +356,101 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
---
### T6: Register Semantic Kernel
### T6: Register Semantic Kernel with Azure OpenAI
**Prerequisites:** T1 (NuGet package installed).
**Prerequisites:** T1 (NuGet packages installed). Developer has run `az login --tenant <tenant-id>`.
**Context:** This registers the SK Kernel and OpenAI chat completion connector in DI. The connector works with any OpenAI-compatible API, so we point it at CLIProxyAPI (a local proxy that routes to Claude/GPT).
**Context:** This registers the SK Kernel and Azure OpenAI chat completion connector in DI. Unlike the source project (which used a local proxy), the CRC sandbox uses Azure OpenAI with Azure AD authentication. The key differences: use `AddAzureOpenAIChatCompletion()` (not `AddOpenAIChatCompletion()`), use deployment name (not model name), endpoint has NO `/v1` suffix, and auth uses `DefaultAzureCredential` with the tenant ID.
**Step-by-step:**
1. Add `using Microsoft.SemanticKernel;` to the startup file
2. Read config values from `NlxvaPricer:*` section
3. Register: `AddOpenAIChatCompletion()` then `AddKernel()`
4. The base URL **MUST** include `/v1` — this is the most common misconfiguration
1. Add `using Microsoft.SemanticKernel;` and `using Azure.Identity;` to the startup file
2. Read config values from `NlxvaPricer:*` section (AzureOpenAIEndpoint, DeploymentName, TenantId)
3. Register: `AddAzureOpenAIChatCompletion()` then `AddKernel()`
4. The endpoint is the Azure resource URL — do NOT add `/v1` (the Azure SDK handles path construction)
5. Use `DefaultAzureCredential` with the tenant ID
```csharp
var azureEndpoint = builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"];
var deploymentName = builder.Configuration["NlxvaPricer:DeploymentName"];
var tenantId = builder.Configuration["NlxvaPricer:TenantId"];
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: deploymentName,
endpoint: azureEndpoint,
credentials: new DefaultAzureCredential(
new DefaultAzureCredentialOptions { TenantId = tenantId }));
builder.Services.AddKernel();
```
**Expected friction on target:**
- **CLIProxyAPI availability**: The proxy must be running on the target machine at the configured URL. If CRC's server runs on a different machine than the developer's laptop (where CLIProxyAPI runs), you'll need network routing or to deploy CLIProxyAPI alongside CRC.
- **API key**: CLIProxyAPI may not check the key, but the SK OpenAI connector requires a non-empty string. Use `"not-needed"` as a placeholder.
- **`az login` not done**: `DefaultAzureCredential` tries multiple auth methods in sequence (environment vars → managed identity → Visual Studio → Azure CLI → etc.). On a developer machine, it relies on Azure CLI. If the developer hasn't run `az login --tenant <tenant-id>`, the error at runtime will be a confusing `CredentialUnavailableException` listing all the methods it tried. The fix is always: `az login --tenant <tenant-id>`.
- **Deployment name vs model name**: In Azure portal, you deploy a model (e.g., `gpt-4o`) and give the deployment a name (e.g., `gpt4o-prod`). You pass the **deployment name** to SK, not the model name. Ask your Azure admin for the deployment name.
- **Azure RBAC permissions**: The developer's Azure AD identity needs the "Cognitive Services OpenAI User" role on the Azure OpenAI resource. Without it, you'll get a 403.
**Verify it works:**
- `dotnet build` succeeds (SK NuGet resolved correctly)
- `dotnet build` succeeds
- At runtime: inject `Kernel` into a test controller and verify it resolves
- Quick smoke test: call `kernel.GetRequiredService<IChatCompletionService>()` — should not throw
- Quick smoke test: `kernel.GetRequiredService<IChatCompletionService>()` — should not throw
- Full test: the diagnostic stream-test endpoint (see T6b below)
**If it breaks — diagnostic checklist:**
- Symptom: 404 on LLM requests
Cause: Base URL missing `/v1`
Fix: Change `http://localhost:8317` to `http://localhost:8317/v1`
- Symptom: `HttpRequestException: Connection refused`
Cause: CLIProxyAPI not running
Fix: Start CLIProxyAPI on the target machine, verify with `curl http://localhost:8317/v1/models`
- Symptom: `CredentialUnavailableException` with "DefaultAzureCredential failed to retrieve a token"
Cause: Developer not logged in to Azure CLI
Fix: Run `az login --tenant <tenant-id>`, then restart CRC.Server
- Symptom: HTTP 403 Forbidden from Azure OpenAI
Cause: Azure AD identity lacks "Cognitive Services OpenAI User" role
Fix: Ask Azure admin to grant the role on the Azure OpenAI resource
- Symptom: HTTP 404 on Azure OpenAI endpoint
Cause: Wrong deployment name, or deployment doesn't exist
Fix: Verify deployment name in Azure portal → Azure OpenAI → Deployments
- Symptom: `InvalidOperationException: No service for type IChatCompletionService`
Cause: `AddOpenAIChatCompletion()` not called before `AddKernel()`
Fix: Ensure registration order: OpenAIChatCompletion first, then Kernel
Cause: `AddAzureOpenAIChatCompletion()` not called before `AddKernel()`
Fix: Ensure registration order: AzureOpenAIChatCompletion first, then Kernel
---
### T6b: Verify streaming hop 1 (Azure OpenAI → CRC.Server)
**Prerequisites:** T6 (SK registered), T7 (controller exists — or add the diagnostic endpoint to any controller temporarily).
**Context:** Before building the full UI, verify that tokens actually stream from Azure OpenAI through CRC.Server. This catches buffering issues early (response compression middleware, Azure API Management, corporate proxies).
**Step-by-step:**
1. Add a temporary diagnostic endpoint to NlxvaPricerController (see Critical Pattern #8 in export-spec)
2. Run: `curl -N https://localhost:7100/api/nlxva-pricer/stream-test`
3. Watch the timestamps in the output
**What correct streaming looks like:**
```
data: [450ms] 1 ← timestamps spread across seconds
data: [620ms]
data: [780ms] 2
data: [950ms]
data: [1100ms] 3
```
**What buffered streaming looks like:**
```
data: [8200ms] 1 ← all timestamps clustered at the end
data: [8201ms]
data: [8202ms] 2
data: [8203ms]
```
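The comparison between the two outputs above can be automated. The sketch below is a hypothetical helper, not part of the export bundle: it assumes the `data: [<n>ms]` line format shown in the samples, and the 100 ms spread threshold is an arbitrary illustrative default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical analyzer for saved `curl -N` output of the stream-test endpoint.
public static class StreamTestAnalyzer
{
    // Matches lines such as "data: [450ms] 1" and captures the millisecond value.
    private static readonly Regex TimestampPattern =
        new Regex(@"^data: \[(\d+)ms\]", RegexOptions.Compiled);

    public static IReadOnlyList<long> ExtractTimestamps(string curlOutput) =>
        curlOutput.Split('\n')
            .Select(line => TimestampPattern.Match(line.Trim()))
            .Where(match => match.Success)
            .Select(match => long.Parse(match.Groups[1].Value))
            .ToList();

    // Buffered output shows all events arriving within a few milliseconds of
    // each other; streamed output spreads them out. minSpreadMs is an assumption.
    public static bool LooksBuffered(string curlOutput, long minSpreadMs = 100)
    {
        var timestamps = ExtractTimestamps(curlOutput);
        if (timestamps.Count < 2)
        {
            throw new ArgumentException(
                "Need at least two timestamped events.", nameof(curlOutput));
        }
        return timestamps[timestamps.Count - 1] - timestamps[0] < minSpreadMs;
    }
}
```

With the two samples above, the first (450 ms to 1100 ms) has a spread of 650 ms and is classified as streaming, while the second (8200 ms to 8203 ms) has a spread of 3 ms and is classified as buffered.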
**If buffered — check these in order:**
1. **Response compression middleware**: If CRC.Server has `app.UseResponseCompression()`, it buffers SSE to compress. Add `Response.Headers["Content-Encoding"] = "identity";` in the controller to opt out.
2. **Azure API Management (APIM)**: If APIM sits in front of the Azure OpenAI resource, it buffers responses by default. Set `buffer-response="false"` on the `forward-request` policy.
3. **Corporate HTTPS proxy**: Check `echo $HTTPS_PROXY` on the server. May need proxy bypass for `*.openai.azure.com`.
4. **IIS**: If CRC runs under IIS, add `responseBufferLimit="0"` in web.config.
**Always set these headers on SSE endpoints:**
```csharp
Response.ContentType = "text/event-stream";
Response.Headers["Cache-Control"] = "no-cache";
Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering
```
After verification, remove the diagnostic endpoint.
---
@@ -561,10 +633,11 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
**Step-by-step checklist:**
- [ ] `NlxvaPricer:LlmBaseUrl` in CRC.Server `appsettings.json` — default `http://localhost:8317/v1`
- [ ] `NlxvaPricer:LlmModel` in CRC.Server `appsettings.json` — default `claude-sonnet-4-6`
- [ ] `NlxvaPricer:LlmApiKey` in CRC.Server `appsettings.json` — default `not-needed`
- [ ] `NlxvaPricer:AzureOpenAIEndpoint` in CRC.Server `appsettings.json` — e.g., `https://your-resource.openai.azure.com/` — **no `/v1`**. What happens if missing: SK registration fails at startup
- [ ] `NlxvaPricer:DeploymentName` in CRC.Server `appsettings.json` — the Azure deployment name (not model name). Get from Azure portal → Azure OpenAI → Deployments
- [ ] `NlxvaPricer:TenantId` in CRC.Server `appsettings.json` — Azure AD tenant ID. Same tenant CRC uses for Microsoft.Identity.Web auth
- [ ] `NlxvaPricer:FewShotPath` in CRC.Server `appsettings.json` — default `examples/extraction`
- [ ] Developer has run `az login --tenant <tenant-id>` — `DefaultAzureCredential` needs this. Failure shows at first LLM call, not at startup
- [ ] `ExternalApis:CounterpartyBaseUrl` — default `http://localhost:5000/api/counterparty` (or use CRC's existing)
- [ ] `ExternalApis:TradeBaseUrl` — default `http://localhost:5000/api/trade` (or use CRC's existing)
- [ ] `ExternalApis:CurrencyBaseUrl` — default `http://localhost:5000/api/currency` (or use CRC's existing)
@@ -579,7 +652,7 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
**Full verification sequence:**
1. `dotnet build --configuration release CRC.sln` — 0 errors, 0 new warnings
2. Start CLIProxyAPI on target machine
2. Ensure developer has run `az login --tenant <tenant-id>`
3. Start CRC.Server
4. Navigate to CRC.Client in browser
5. Verify "NL XVA Pricer" appears in sidebar
@@ -598,7 +671,7 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
| # | Symptom | Likely Cause | Fix |
|---|---|---|---|
| 1 | 404 on `/v1/chat/completions` | Base URL missing `/v1` suffix | Set `NlxvaPricer:LlmBaseUrl` to `http://localhost:8317/v1` |
| 1 | 404 on Azure OpenAI endpoint | Wrong deployment name or endpoint URL | Verify deployment name in Azure portal; endpoint should be `https://<resource>.openai.azure.com/` with NO `/v1` |
| 2 | CORS 403 in browser console | CORS policy doesn't cover CRC.Client origin or `text/event-stream` | Add CRC.Client origin with `AllowAnyHeader()` in CORS config |
| 3 | No streaming — entire response at once | `SetBrowserResponseStreamingEnabled(true)` missing on client | Add to HttpRequestMessage before SendAsync |
| 4 | `NotSupportedException: Synchronous operations` | Using `reader.EndOfStream` in WASM | Replace with `while ((line = await ReadLineAsync()) != null)` |
@@ -611,7 +684,10 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
| 11 | `FileNotFoundException` for instruction-template.txt | Examples folder not at ContentRootPath | Log ContentRootPath; verify examples location; update FewShotPath config |
| 12 | Empty few-shot examples (only system message) | Subdirectory structure wrong | Verify `examples/extraction/few-shot/01/input.html` exists |
| 13 | `NuGet restore error` for SemanticKernel | Package not in GV Artifactory feed | Request mirroring or temporary nuget.org source |
| 14 | `HttpRequestException: Connection refused` | CLIProxyAPI not running | Start proxy; verify with `curl http://localhost:8317/v1/models` |
| 14 | `CredentialUnavailableException` from DefaultAzureCredential | Developer not logged in via Azure CLI | Run `az login --tenant <tenant-id>`, restart CRC.Server |
| 14b | HTTP 403 from Azure OpenAI | Azure AD identity lacks role | Grant "Cognitive Services OpenAI User" on the Azure OpenAI resource |
| 14c | All tokens arrive at once (no streaming) | Response compression or proxy buffering | Use stream-test diagnostic endpoint; check `UseResponseCompression()`; set `X-Accel-Buffering: no` header |
| 14d | Streaming works in curl but not in browser | Response compression only applied for browser Accept-Encoding | Add `Response.Headers["Content-Encoding"] = "identity"` in SSE endpoints |
| 15 | Drag-drop file not triggering extraction | `file-drop.js` not loaded | Check `<script>` tag in index.html; check browser console for JS errors |
| 16 | `window.fileDrop is undefined` | Script loaded after Blazor framework init | Move `<script>` tag before `_framework/blazor.webassembly.js` |
| 17 | `JsonException` when parsing SSE data | SSE line doesn't match expected format | Add logging for raw SSE lines; check server-side WriteSSEAsync format |
@@ -627,6 +703,20 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
- **NuGet source:** Available on nuget.org. If CRC's GV Artifactory doesn't mirror it, this is a blocker — request mirroring.
- **Size:** ~5MB total with dependencies
### Microsoft.SemanticKernel.Connectors.AzureOpenAI
- **Why needed:** Azure OpenAI-specific connector for SK (provides `AddAzureOpenAIChatCompletion()`)
- **.NET compatibility:** Same as core SK package
- **Transitive dependencies:** Pulls in `Azure.AI.OpenAI` SDK
- **NuGet source:** Same as core SK — nuget.org
- **Note:** This is separate from the core SK package. Without it, only `AddOpenAIChatCompletion()` is available (for non-Azure endpoints).
### Azure.Identity
- **Why needed:** Provides `DefaultAzureCredential` for Azure AD authentication to Azure OpenAI
- **.NET compatibility:** .NET Standard 2.0+ (compatible with everything)
- **CRC likely already has this** — it uses `Microsoft.Identity.Web` for Azure AD auth. Check `grep Azure.Identity CRC.Server.csproj`.
- **Version conflicts:** If CRC has an older version, SK may pull in a newer one. Usually compatible, but verify with `dotnet build`.
- **NuGet source:** Available on nuget.org and commonly mirrored in enterprise feeds
### Markdig (1.1.1)
- **Why needed:** Markdown → HTML conversion for rendering LLM responses
- **.NET compatibility:** .NET Standard 2.0+ (compatible with everything)
@@ -664,8 +754,10 @@ If the feature needs to be removed:
**NuGet packages to remove:**
- `Microsoft.SemanticKernel` from CRC.Server
- `Microsoft.SemanticKernel.Connectors.AzureOpenAI` from CRC.Server
- `Azure.Identity` from CRC.Server (only if not used by other CRC features — likely IS used, so leave it)
- `Markdig` from CRC.Client (if not used by other features)
**Config keys to remove:**
- `NlxvaPricer:*` section from `appsettings.json`
- `NlxvaPricer:*` section (AzureOpenAIEndpoint, DeploymentName, TenantId, FewShotPath) from `appsettings.json`
- `ExternalApis:*` section (if only used by this feature)


@@ -58,6 +58,8 @@ This feature is a GUEST in CRC. Existing code, patterns, and conventions take ab
Add to `CRC.Server`:
- `Microsoft.SemanticKernel` (latest stable, >=1.x)
- `Microsoft.SemanticKernel.Connectors.AzureOpenAI` (for Azure OpenAI connector)
- `Azure.Identity` (for `DefaultAzureCredential` — CRC may already have this)
- `Markdig` 1.1.1 (if CRC.Client doesn't already have it — check first)
No new packages for CRC.Client or CRC.Shared (MudBlazor already present).
@@ -73,7 +75,7 @@ CRC.Server (ASP.NET Core)
├── NlxvaPricerController
│ ├── POST /api/nlxva-pricer/chat (general chat)
│ └── POST /api/nlxva-pricer/extract (email extraction)
│ Uses: Semantic Kernel → CLIProxyAPI (OpenAI-compatible proxy)
│ Uses: Semantic Kernel → Azure OpenAI (via DefaultAzureCredential)
│ Uses: ExtractionPlugin (tool calling)
│ Uses: FewShotService (example loading)
├── Services/
@@ -232,9 +234,9 @@ data: {"error":"message"}\n\n ← on failure (followed by [DONE])
```json
{
"NlxvaPricer": {
"LlmBaseUrl": "http://localhost:8317/v1",
"LlmModel": "claude-sonnet-4-6",
"LlmApiKey": "not-needed",
"AzureOpenAIEndpoint": "https://your-resource.openai.azure.com/",
"DeploymentName": "gpt4o-prod",
"TenantId": "<your-azure-ad-tenant-id>",
"FewShotPath": "examples/extraction"
},
"ExternalApis": {
@@ -245,6 +247,11 @@ data: {"error":"message"}\n\n ← on failure (followed by [DONE])
}
```
If using API key auth instead of Azure AD, replace `TenantId` with:
```json
"ApiKey": "<your-azure-openai-api-key>"
```
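The most common misconfigurations of this section (a leftover `/v1` suffix, a missing deployment name or tenant ID) can be caught before any call reaches Azure. The following is a sketch only: `NlxvaPricerOptions` and its `Validate()` method are hypothetical names, the property names mirror the JSON keys above, and the rules are limited to what this document states.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical options class mirroring the "NlxvaPricer" section.
public sealed class NlxvaPricerOptions
{
    public string? AzureOpenAIEndpoint { get; set; }
    public string? DeploymentName { get; set; }
    public string? TenantId { get; set; }
    public string? FewShotPath { get; set; }

    // Returns a list of human-readable problems; empty means the section looks sane.
    public IReadOnlyList<string> Validate()
    {
        var errors = new List<string>();

        if (!Uri.TryCreate(AzureOpenAIEndpoint, UriKind.Absolute, out var uri)
            || uri.Scheme != Uri.UriSchemeHttps)
        {
            errors.Add("AzureOpenAIEndpoint must be an absolute https URL.");
        }
        else if (uri.AbsolutePath.TrimEnd('/')
                     .EndsWith("/v1", StringComparison.OrdinalIgnoreCase))
        {
            // The Azure connector builds the request path itself.
            errors.Add("AzureOpenAIEndpoint must not end with /v1.");
        }

        if (string.IsNullOrWhiteSpace(DeploymentName))
            errors.Add("DeploymentName is required (deployment name, not model name).");
        if (string.IsNullOrWhiteSpace(TenantId))
            errors.Add("TenantId is required for Azure AD auth.");

        return errors;
    }

    // Reads the "NlxvaPricer" section out of an appsettings-style JSON document.
    public static NlxvaPricerOptions FromAppSettingsJson(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var section = doc.RootElement.GetProperty("NlxvaPricer");
        return JsonSerializer.Deserialize<NlxvaPricerOptions>(section.GetRawText())
               ?? new NlxvaPricerOptions();
    }
}
```

Inside CRC.Server the same class could instead be bound with `builder.Configuration.GetSection("NlxvaPricer").Get<NlxvaPricerOptions>()`; the JSON-parsing method is only there to keep the sketch self-contained. The `TenantId` rule applies to the Azure AD variant, not the API key variant.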
## Critical Patterns
### 1. SSE streaming in Blazor WASM — DO NOT use `reader.EndOfStream`
@@ -276,16 +283,33 @@ while ((line = await reader.ReadLineAsync()) != null) // ← NOT EndOfStream
`SetBrowserResponseStreamingEnabled(true)` is a Blazor WASM extension that tells the browser Fetch API
to expose the response as a ReadableStream. Without it, the browser buffers the entire response.
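The read loop can be exercised outside the browser against any `Stream`. This is a minimal sketch, assuming only the `data: ` prefix and the `[DONE]` sentinel from the SSE contract; the `SseReader` class is hypothetical and it yields raw payload strings without interpreting the JSON inside them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reader (not the CRC.Client implementation).
public static class SseReader
{
    private const string Prefix = "data: ";

    // Yields the payload of each "data:" line until "[DONE]" or end of stream.
    public static async IAsyncEnumerable<string> ReadDataAsync(
        Stream body, [EnumeratorCancellation] CancellationToken ct = default)
    {
        using var reader = new StreamReader(body);
        string? line;

        // ReadLineAsync returns null at end of stream; this avoids the
        // synchronous reader.EndOfStream property, which throws in WASM.
        while ((line = await reader.ReadLineAsync()) != null)
        {
            ct.ThrowIfCancellationRequested();

            // Blank separator lines and non-data fields are skipped.
            if (!line.StartsWith(Prefix, StringComparison.Ordinal)) continue;

            var payload = line.Substring(Prefix.Length);
            if (payload == "[DONE]") yield break;

            yield return payload;
        }
    }
}
```

In the WASM client the `Stream` would come from `response.Content.ReadAsStreamAsync()` on a response obtained with `HttpCompletionOption.ResponseHeadersRead`, after enabling browser response streaming on the request.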
### 2. Semantic Kernel base URL must include `/v1`
### 2. Azure OpenAI: use deployment name, NOT model name; NO `/v1` suffix
**Why:** The OpenAI SDK appends `chat/completions` directly to the base URL.
Without `/v1`, requests hit `/chat/completions` instead of `/v1/chat/completions` → 404.
**Why:** Azure OpenAI is registered with `AddAzureOpenAIChatCompletion()`, not `AddOpenAIChatCompletion()`.
The endpoint is your Azure resource URL (no `/v1`; the Azure SDK constructs the path internally).
The `deploymentName` is the name you gave the deployment in the Azure portal, not the underlying model name.
Auth uses `DefaultAzureCredential` scoped to the tenant ID, not an API key.
```csharp
builder.Services.AddOpenAIChatCompletion(
modelId: model,
endpoint: new Uri("http://localhost:8317/v1"), // ← MUST include /v1
apiKey: "not-needed");
using Azure.Identity;
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod",
endpoint: builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
?? "https://your-resource.openai.azure.com/",
credentials: new DefaultAzureCredential(
new DefaultAzureCredentialOptions
{
TenantId = builder.Configuration["NlxvaPricer:TenantId"]
}));
```
If using API key instead of Azure AD:
```csharp
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: "gpt4o-prod",
endpoint: "https://your-resource.openai.azure.com/",
apiKey: builder.Configuration["NlxvaPricer:ApiKey"]);
```
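Because a leftover `/v1` from the old proxy configuration is an easy mistake to carry over, a startup-time check can catch it before the first request fails. The following is a minimal sketch, not part of CRC or Semantic Kernel: `AzureOpenAIEndpointGuard` and `Validate` are hypothetical names, and it assumes the endpoint is the bare resource URL (relax the path check if the resource sits behind a gateway that adds a path prefix).

```csharp
using System;

// Hypothetical helper: rejects endpoint values that still carry an
// OpenAI-style path such as /v1 and returns the normalized resource root.
public static class AzureOpenAIEndpointGuard
{
    public static string Validate(string? endpoint)
    {
        if (!Uri.TryCreate(endpoint, UriKind.Absolute, out var uri) || uri.Scheme != Uri.UriSchemeHttps)
        {
            throw new ArgumentException("AzureOpenAIEndpoint must be an absolute https URL.", nameof(endpoint));
        }

        // Assumption: only the resource root is expected, since the Azure
        // connector builds the request path itself.
        if (uri.AbsolutePath.Trim('/').Length > 0)
        {
            throw new ArgumentException(
                $"AzureOpenAIEndpoint should be the resource root, but has path '{uri.AbsolutePath}'.",
                nameof(endpoint));
        }

        return uri.GetLeftPart(UriPartial.Authority) + "/";
    }
}
```

It could be called on the configured value before passing it to `AddAzureOpenAIChatCompletion()`, so a misconfigured endpoint fails at startup with a readable message rather than as a 404 at request time.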
### 3. Layout height depends on AppBar height
@@ -371,18 +395,72 @@ if (!string.IsNullOrEmpty(parsedText))
yield return parsedText;
```
### 8. SSE response buffering — verify both streaming hops
**Why:** The architecture has two streaming hops: Azure OpenAI → CRC.Server → Browser.
If anything buffers in either hop, the user sees no tokens until the full response completes.
Common buffers: response compression middleware, reverse proxies (NGINX/IIS), Azure API Management.
**Diagnostic endpoint (add temporarily, remove after verifying):**
```csharp
[HttpGet("stream-test")]
public async Task StreamTest()
{
    Response.ContentType = "text/event-stream";
    Response.Headers["Cache-Control"] = "no-cache";
    Response.Headers["X-Accel-Buffering"] = "no"; // NGINX hint

    // Stop generating if the client disconnects.
    var ct = HttpContext.RequestAborted;

    var chatService = _kernel.GetRequiredService<IChatCompletionService>();
    var history = new ChatHistory();
    history.AddUserMessage("Count from 1 to 10, one number per line.");

    var sw = System.Diagnostics.Stopwatch.StartNew();
    await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(history, cancellationToken: ct))
    {
        if (!string.IsNullOrEmpty(chunk.Content))
        {
            // Escape newlines so each chunk stays on a single SSE data line.
            var text = chunk.Content.Replace("\r", "").Replace("\n", "\\n");
            // Timestamp = server-side time at which the chunk arrived from Azure OpenAI.
            await Response.WriteAsync($"data: [{sw.ElapsedMilliseconds}ms] {text}\n\n", ct);
            await Response.Body.FlushAsync(ct);
        }
    }
    await Response.WriteAsync("data: [DONE]\n\n", ct);
}
```
**Test with:** `curl -N https://localhost:7100/api/nlxva-pricer/stream-test`
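- Timestamps spread over seconds and lines appearing one by one in the terminal = both hops stream
- Timestamps spread over seconds but all lines appearing at once = something between CRC.Server and the client is buffering (the timestamps are taken on the server, so they only prove the first hop)
- All timestamps clustered together = chunks are reaching CRC.Server from Azure OpenAI in one batch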
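The `[Nms]` values are stamped on the server as each chunk arrives from Azure OpenAI, so saved `curl` output can also be checked mechanically instead of by eye. A minimal sketch, assuming the `data: [Nms] ...` line format of the diagnostic endpoint above; `StreamTestReport` and its methods are hypothetical names, and what counts as "clustered" is left to the reader.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for reading saved output of the stream-test endpoint.
public static class StreamTestReport
{
    private static readonly Regex Stamp = new Regex(@"^data: \[(\d+)ms\]", RegexOptions.Compiled);

    // Extracts the server-side elapsed times; lines such as "data: [DONE]" are skipped.
    public static List<long> ParseTimestamps(IEnumerable<string> lines) =>
        lines.Select(line => Stamp.Match(line))
             .Where(match => match.Success)
             .Select(match => long.Parse(match.Groups[1].Value))
             .ToList();

    // Time between the first and last chunk. A value near zero suggests the
    // chunks reached the server together rather than as a stream.
    public static long SpreadMs(IReadOnlyList<long> stamps) =>
        stamps.Count < 2 ? 0 : stamps[stamps.Count - 1] - stamps[0];
}
```

For example, after redirecting the `curl -N` output to a file, `StreamTestReport.SpreadMs(StreamTestReport.ParseTimestamps(File.ReadLines(path)))` gives the first-hop spread in milliseconds. It says nothing about the second hop, which still has to be judged by whether lines appear progressively in the terminal.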
**If CRC.Server uses `UseResponseCompression()`, exclude SSE:**
```csharp
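// Set before the first write: the compression middleware skips responses
// that already carry a Content-Encoding header.
Response.Headers["Content-Encoding"] = "identity"; // opt out per-response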
```
**Response headers to always set on SSE endpoints:**
```csharp
Response.ContentType = "text/event-stream";
Response.Headers["Cache-Control"] = "no-cache";
Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering
```
## Wiring
### CRC.Server DI registration order (add to existing Program.cs / Startup.cs)
```csharp
// 1. Semantic Kernel — OpenAI-compatible connector
var llmBaseUrl = builder.Configuration["NlxvaPricer:LlmBaseUrl"] ?? "http://localhost:8317/v1";
var llmModel = builder.Configuration["NlxvaPricer:LlmModel"] ?? "claude-sonnet-4-6";
builder.Services.AddOpenAIChatCompletion(
modelId: llmModel,
endpoint: new Uri(llmBaseUrl),
apiKey: builder.Configuration["NlxvaPricer:LlmApiKey"] ?? "not-needed");
// 1. Semantic Kernel — Azure OpenAI connector with Azure AD auth
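// Requires "using Azure.Identity;" alongside the other using directives at the top of the file
// (a using directive placed after other statements does not compile).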
var azureEndpoint = builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
?? "https://your-resource.openai.azure.com/";
var deploymentName = builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod";
var tenantId = builder.Configuration["NlxvaPricer:TenantId"];
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: deploymentName,
endpoint: azureEndpoint,
credentials: new DefaultAzureCredential(
new DefaultAzureCredentialOptions { TenantId = tenantId }));
builder.Services.AddKernel();
// 2. External API typed HttpClients