fix: update export bundle for Azure OpenAI and add streaming diagnostics

Replace CLIProxyAPI/local proxy references with Azure OpenAI using DefaultAzureCredential and tenant ID auth. Add Critical Pattern #8 for SSE buffering diagnostics with timestamped curl test. Add streaming verification tasks (T6b, T15) and troubleshooting entries for Azure AD auth, RBAC, response compression, and proxy buffering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -62,7 +62,7 @@ This feature automates extraction with AI + tool-calling validation, reducing er
 ## Scope
 - New page at /nlxva-pricer, new MudNavLink in existing NavMenu
 - New controller with 2 endpoints (chat + extract), same SSE streaming contract
-- Semantic Kernel integration with OpenAI-compatible proxy
+- Semantic Kernel integration with Azure OpenAI (Azure AD auth via tenant ID)
 - Few-shot prompting infrastructure (instruction template + 3 examples)
 - External API clients for counterparty/trade/currency validation
 - Client-side markdown rendering with XSS sanitization
@@ -89,6 +89,14 @@ This feature automates extraction with AI + tool-calling validation, reducing er
 With raw HttpClient we'd need to manually parse tool-call JSON, dispatch functions,
 and feed results back. SK handles this loop automatically via FunctionChoiceBehavior.Auto().

+## Architecture Decision: Azure OpenAI with DefaultAzureCredential
+
+**Why:** The sandbox environment uses Azure OpenAI with an Azure AD tenant ID.
+SK's `AddAzureOpenAIChatCompletion()` with `DefaultAzureCredential` integrates
+with CRC's existing Azure AD auth. No API keys to manage — uses the developer's
+`az login` token locally and managed identity in production. The endpoint URL
+does NOT need `/v1` (Azure SDK constructs the path internally).
+
 ## Architecture Decision: SSE streaming over WebSocket

 **Why:** SSE is simpler (unidirectional server→client), works through HTTP proxies,
@@ -133,10 +141,25 @@ in the Razor component. CRC's existing Fluxor infrastructure is untouched.
 CRC.Server may need its CORS policy updated to allow SSE streaming (Content-Type: text/event-stream)
 to the CRC.Client origin. Verify existing policy covers this.

+## Risk: SSE response buffering
+
+Two streaming hops: Azure OpenAI → CRC.Server → Browser. Buffering at any point kills
+streaming UX. Common culprits: response compression middleware (`UseResponseCompression()`),
+reverse proxies (NGINX, IIS), Azure API Management in front of Azure OpenAI.
+Use the diagnostic stream-test endpoint (see Critical Pattern #8 in reference spec)
+to verify both hops stream correctly before building the UI.
+
 ## Risk: Semantic Kernel version compatibility

 CRC targets .NET 8.0. Ensure the SK NuGet package version is compatible with .NET 8.
-Current stable SK packages support .NET 8+.
+Current stable SK packages support .NET 8+. Also need `Microsoft.SemanticKernel.Connectors.AzureOpenAI`
+and `Azure.Identity` packages.

+## Risk: Azure AD token acquisition
+
+`DefaultAzureCredential` tries multiple auth methods in sequence. On a developer machine,
+it uses Azure CLI login (`az login --tenant <tenant-id>`). If the developer hasn't run
+`az login`, SK will fail with an auth error at the first LLM call, not at startup.
+
 ## Risk: Large file uploads

@@ -153,7 +176,7 @@ Email HTML files are read entirely into memory (max 10MB guard). For typical sal

 ## Phase 1: Foundation (Server)

-- [ ] **T1: Add NuGet packages** — Add `Microsoft.SemanticKernel` to CRC.Server. Add `Markdig` 1.1.1 to CRC.Client (if not already present). Verify .NET 8 compatibility.
+- [ ] **T1: Add NuGet packages** — Add `Microsoft.SemanticKernel`, `Microsoft.SemanticKernel.Connectors.AzureOpenAI`, and `Azure.Identity` to CRC.Server. Add `Markdig` 1.1.1 to CRC.Client (if not already present). CRC may already have `Azure.Identity` — check first. Verify .NET 8 compatibility.

 - [ ] **T2: Add shared DTOs** — Create in CRC.Shared: `NlxvaChatMessage`, `NlxvaChatRequest`, `NlxvaModelSettings`, `NlxvaExtractionRequest`, `NlxvaExtractionResult`, `TradeItem` (with `[JsonPropertyName]` snake_case), `NlxvaValidationResult`, `NlxvaCandidateMatch`. See Contracts section in reference spec for exact shapes.

@@ -163,7 +186,9 @@ Email HTML files are read entirely into memory (max 10MB guard). For typical sal

 - [ ] **T5: Add FewShotService** — Create in CRC.Server/Services: `FewShotService` that loads instruction template + few-shot examples from disk. Caches ChatHistory prefix. Methods: `CloneWithEmail()`, `CloneWithEmailAndMessages()`. Register as Singleton. Copy examples/ folder to CRC.Server root.

-- [ ] **T6: Register Semantic Kernel** — In CRC.Server DI: `AddOpenAIChatCompletion()` + `AddKernel()`. Base URL MUST include `/v1`. Config from `NlxvaPricer:*` keys in appsettings.json. See Critical Pattern #2.
+- [ ] **T6: Register Semantic Kernel** — In CRC.Server DI: `AddAzureOpenAIChatCompletion()` with `DefaultAzureCredential` (tenant ID from config) + `AddKernel()`. Endpoint is Azure OpenAI resource URL (NO `/v1`). Use deployment name, NOT model name. Config from `NlxvaPricer:*` keys in appsettings.json. See Critical Pattern #2.

+- [ ] **T6b: Verify streaming hop 1** — Add temporary `stream-test` diagnostic endpoint (see Critical Pattern #8). Run `curl -N` against it. Verify timestamps are spread across seconds (not clustered). Check for response compression middleware interference. Remove diagnostic endpoint after verification.
+
 - [ ] **T7: Add NlxvaPricerController** — Create controller with `POST /api/nlxva-pricer/chat` and `POST /api/nlxva-pricer/extract`. Both stream SSE. Chat endpoint: builds ChatHistory from messages + optional system prompt + model settings. Extract endpoint: uses FewShotService prefix. Both import ExtractionPlugin per-request and enable `FunctionChoiceBehavior.Auto()`. See Critical Pattern #6.

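Review note on T6b: the "spread across seconds (not clustered)" check can be automated against captured `curl -N` output. The following is a minimal sketch, not part of the commit. `StreamTimingCheck`, its method names, and the 250 ms threshold are hypothetical choices for illustration; the `data: [Nms] token` line shape is the one the diagnostic output in this change uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StreamTimingCheck
{
    // Extracts the elapsed-milliseconds value from lines shaped like "data: [450ms] 1".
    // Lines that do not match (blank SSE separators, other output) are skipped.
    public static List<long> ParseElapsedMs(IEnumerable<string> lines)
    {
        var pattern = new Regex(@"^data: \[(\d+)ms\]");
        return lines
            .Select(line => pattern.Match(line))
            .Where(match => match.Success)
            .Select(match => long.Parse(match.Groups[1].Value))
            .ToList();
    }

    // Heuristic (an assumption, not a rule from the spec): the stream looks
    // buffered when the first and last events arrive within minSpreadMs of each other.
    public static bool LooksBuffered(IReadOnlyList<long> elapsedMs, long minSpreadMs = 250)
    {
        if (elapsedMs.Count < 2)
        {
            return false; // too few events to judge
        }
        return elapsedMs[elapsedMs.Count - 1] - elapsedMs[0] < minSpreadMs;
    }
}
```

Feeding it the lines from a saved `curl -N` run gives a yes/no answer that can be pasted into the task checklist instead of eyeballing timestamps.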
@@ -183,9 +208,11 @@ Email HTML files are read entirely into memory (max 10MB guard). For typical sal

 ## Phase 3: Verify

-- [ ] **T14: Config** — Add `NlxvaPricer` and `ExternalApis` sections to CRC.Server appsettings.json. Ensure CORS allows CRC.Client origin for SSE responses.
+- [ ] **T14: Config** — Add `NlxvaPricer` (AzureOpenAIEndpoint, DeploymentName, TenantId, FewShotPath) and `ExternalApis` sections to CRC.Server appsettings.json. Ensure CORS allows CRC.Client origin for SSE responses. Ensure developer has run `az login --tenant <tenant-id>`.

-- [ ] **T15: Smoke test** — Build both projects. Navigate to /nlxva-pricer. Send a chat message → verify streaming. Upload an example email HTML → verify extraction streams. Verify New Chat resets. Verify drag-drop visual feedback.
+- [ ] **T15: Verify streaming end-to-end** — Run `curl -N` against `/api/nlxva-pricer/chat` to verify hop 2 (server → client) streams correctly. Check browser Network tab EventStream view for incremental token delivery. If response compression is enabled, verify SSE endpoints opt out.

+- [ ] **T16: Smoke test** — Build both projects. Navigate to /nlxva-pricer. Send a chat message → verify streaming tokens appear incrementally. Upload an example email HTML → verify extraction streams. Verify New Chat resets. Verify drag-drop visual feedback.
+
 ## Implementation Notes

@@ -24,9 +24,9 @@ You have three companion documents for this port:

 The Natural Language XVA Pricer is a chat-based interface that lets the CVA desk interact with an AI agent to price trades using natural language. It serves two modes: **general chat** (ask questions about XVA pricing, get explanations) and **email extraction** (upload a sales email, get structured trade data back as JSON).

-The data flows like this: The user types a message or drops an email `.html` file onto the chat area. The Blazor WASM client sends the request to the ASP.NET Core backend via HTTP POST. The backend processes it through **Microsoft Semantic Kernel** — an AI orchestration framework that connects to an OpenAI-compatible LLM proxy (CLIProxyAPI running locally). For extraction requests, the backend prepends **few-shot examples** (real email → expected JSON pairs loaded from disk) to teach the model the expected output format. The LLM can autonomously call **validation tools** (counterparty lookup, trade ID validation, currency validation, schema validation) via SK's automatic function calling. The response streams back token-by-token as **Server-Sent Events (SSE)**, and the client renders each token into the chat UI with **markdown formatting** and **XSS sanitization**.
+The data flows like this: The user types a message or drops an email `.html` file onto the chat area. The Blazor WASM client sends the request to the ASP.NET Core backend via HTTP POST. The backend processes it through **Microsoft Semantic Kernel** — an AI orchestration framework that connects to **Azure OpenAI** using Azure AD authentication (the same tenant CRC already uses). For extraction requests, the backend prepends **few-shot examples** (real email → expected JSON pairs loaded from disk) to teach the model the expected output format. The LLM can autonomously call **validation tools** (counterparty lookup, trade ID validation, currency validation, schema validation) via SK's automatic function calling. The response streams back token-by-token as **Server-Sent Events (SSE)**, and the client renders each token into the chat UI with **markdown formatting** and **XSS sanitization**.

-The external dependencies are: (1) a CLIProxyAPI proxy for LLM access (any OpenAI-compatible endpoint works), (2) three external APIs for validation (counterparty, trade, currency) — these are the existing CRC backend services that CRC.Server already integrates with, and (3) the `Markdig` NuGet package for markdown rendering plus `Microsoft.SemanticKernel` for LLM orchestration.
+The external dependencies are: (1) an Azure OpenAI resource with a deployed model (authenticated via Azure AD tenant ID), (2) three external APIs for validation (counterparty, trade, currency) — these are the existing CRC backend services that CRC.Server already integrates with, and (3) the `Markdig` NuGet package for markdown rendering plus `Microsoft.SemanticKernel` for LLM orchestration.

 **The one thing you must understand**: this feature is an isolated page. It doesn't need Fluxor, doesn't modify CRC's data layer, and doesn't touch the Pricer/MarketData/XVA/Sales pages. It adds a controller, some services, a page, and a nav link. If something goes wrong during porting, the blast radius is limited to the new files.

@@ -34,20 +34,25 @@ The external dependencies are: (1) a CLIProxyAPI proxy for LLM access (any OpenA

 ## Design Decisions (Detailed)

-### 1. Semantic Kernel over raw HttpClient for LLM communication
+### 1. Semantic Kernel with Azure OpenAI for LLM communication

-**What we chose:** Microsoft Semantic Kernel (SK) as the AI orchestration layer.
+**What we chose:** Microsoft Semantic Kernel (SK) as the AI orchestration layer, connecting to Azure OpenAI via Azure AD authentication.

 **Why:** The core value isn't just chat — it's the **extraction agent loop**. The agent extracts trade data, calls validation tools, interprets results, retries with fixes, and escalates to the user. Without SK, you'd need to: (a) manually parse the LLM's tool-call JSON from the streaming response, (b) dispatch to the correct C# function, (c) serialize the result, (d) feed it back to the LLM, (e) handle the loop termination. SK does all of this with one line: `FunctionChoiceBehavior.Auto()`. It turns ~200 lines of manual orchestration into zero.

+Azure OpenAI is the LLM backend because CRC's sandbox environment provides it with an Azure AD tenant. `DefaultAzureCredential` integrates with CRC's existing Azure AD auth — no separate API keys to manage. On a developer's machine it uses the `az login` token; in production it can use managed identity.
+
 **What we rejected:**
 - **Raw HttpClient + manual SSE parsing** — This was the original Phase 2 approach. It works for simple chat but doesn't support tool calling without writing a full agent loop. Rejected when we added extraction tools.
 - **LangChain/.NET equivalent** — Considered briefly. SK is Microsoft's official offering, has first-class .NET support, and integrates cleanly with ASP.NET Core DI. LangChain's .NET port was less mature.
-- **Azure OpenAI Service directly** — CRC's network may not allow direct Azure OpenAI access from the server. CLIProxyAPI acts as a local proxy, and SK's OpenAI connector targets any OpenAI-compatible endpoint.
+- **OpenAI direct (non-Azure)** — CRC's network may not allow direct OpenAI access. Azure OpenAI is within the corporate Azure tenant, which is already permitted.
+- **API key auth** — Simpler to configure but keys need rotation and secure storage. Azure AD tokens are automatic and tied to the developer/service identity.

-**When you'd revisit this:** If CRC moves to Azure OpenAI with managed identity auth, you'd swap `AddOpenAIChatCompletion()` for `AddAzureOpenAIChatCompletion()`. SK makes this a one-line change.
+**When you'd revisit this:** If the Azure OpenAI resource is decommissioned or you need a different model provider, swap `AddAzureOpenAIChatCompletion()` for `AddOpenAIChatCompletion()` — SK abstracts the difference. Everything downstream (controller, plugins, streaming) stays identical.

-**Target adaptation:** CRC uses Scrutor for assembly scanning. SK's `AddKernel()` and `AddOpenAIChatCompletion()` are explicit registrations that coexist with Scrutor — no conflict. But verify that Scrutor doesn't auto-register ExtractionPlugin before your manual `AddScoped<ExtractionPlugin>()` call (it could if it scans the Plugins namespace). If it does, you'll get the plugin registered without its HttpClient dependencies. Check by looking at CRC's Scrutor scan filters.
+**Target adaptation:** CRC uses Scrutor for assembly scanning. SK's `AddKernel()` and `AddAzureOpenAIChatCompletion()` are explicit registrations that coexist with Scrutor — no conflict. But verify that Scrutor doesn't auto-register ExtractionPlugin before your manual `AddScoped<ExtractionPlugin>()` call (it could if it scans the Plugins namespace). If it does, you'll get the plugin registered without its HttpClient dependencies. Check by looking at CRC's Scrutor scan filters.

+**Azure AD prerequisite:** Developers must run `az login --tenant <tenant-id>` before starting CRC.Server. `DefaultAzureCredential` will silently fail at the first LLM call (not at startup) if the token isn't available — the error message mentions "ManagedIdentityCredential" and "EnvironmentCredential" failures, which can be confusing. The fix is always `az login`.
+
 ---

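Review note on the Azure AD prerequisite: the credential does not fail quietly. It throws at the first token request, which in this design happens at the first LLM call. One way to surface a missing login earlier is to request a token once during startup. The following is a minimal sketch under stated assumptions, not part of the commit: `AzureAdStartupProbe` and `EnsureTokenAsync` are hypothetical names, and the scope string is the Cognitive Services scope that Azure OpenAI resources commonly accept, which should be confirmed for the target resource.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Core;
using Azure.Identity;

public static class AzureAdStartupProbe
{
    // Assumed scope for Azure OpenAI (Cognitive Services) access tokens.
    private const string Scope = "https://cognitiveservices.azure.com/.default";

    // Requests one token so that a missing `az login` fails fast at startup
    // instead of at the first chat or extract request.
    public static async Task EnsureTokenAsync(string tenantId, CancellationToken ct = default)
    {
        var credential = new DefaultAzureCredential(
            new DefaultAzureCredentialOptions { TenantId = tenantId });

        try
        {
            await credential.GetTokenAsync(new TokenRequestContext(new[] { Scope }), ct);
        }
        catch (AuthenticationFailedException ex)
        {
            // CredentialUnavailableException derives from AuthenticationFailedException,
            // so this also covers the "no credential in the chain worked" case.
            throw new InvalidOperationException(
                $"Azure AD token unavailable. Try: az login --tenant {tenantId}", ex);
        }
    }
}
```

A call such as `await AzureAdStartupProbe.EnsureTokenAsync(tenantId);` before the app starts listening would turn the confusing first-request error into a startup failure with a direct hint. Whether failing startup is acceptable in production (where managed identity is expected) is a judgment call for the team.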
@@ -200,11 +205,14 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`

 **Step-by-step:**
 1. Add `Microsoft.SemanticKernel` to `CRC.Server.csproj`
-2. Add `Markdig` to `CRC.Client.csproj` (check if it's already there: `grep -i markdig CRC.Client.csproj`)
-3. Run `dotnet restore CRC.sln`
+2. Add `Microsoft.SemanticKernel.Connectors.AzureOpenAI` to `CRC.Server.csproj`
+3. Add `Azure.Identity` to `CRC.Server.csproj` (check first: `grep -i Azure.Identity CRC.Server.csproj` — CRC may already have it since it uses Azure AD)
+4. Add `Markdig` to `CRC.Client.csproj` (check if it's already there: `grep -i markdig CRC.Client.csproj`)
+5. Run `dotnet restore CRC.sln`

 **Expected friction on target:**
 - **GV Artifactory may not have `Microsoft.SemanticKernel`**. SK is a relatively new package. If it's not mirrored in the internal feed, you'll need to either: request it be added to Artifactory, or temporarily add nuget.org as a source in `nuget.config` (check with your team if this is allowed).
+- **Azure.Identity version conflict**. If CRC already has `Azure.Identity` at a different version, the SK transitive dependency may conflict. Run `dotnet list CRC.Server package --include-transitive | grep Azure.Identity` to check.
 - **Version pinning**. CRC uses `RestorePackagesWithLockFile=true` — after installing, commit the updated `packages.lock.json`.

 **Verify it works:**
@@ -348,37 +356,101 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`

 ---

-### T6: Register Semantic Kernel
+### T6: Register Semantic Kernel with Azure OpenAI

-**Prerequisites:** T1 (NuGet package installed).
+**Prerequisites:** T1 (NuGet packages installed). Developer has run `az login --tenant <tenant-id>`.

-**Context:** This registers the SK Kernel and OpenAI chat completion connector in DI. The connector works with any OpenAI-compatible API, so we point it at CLIProxyAPI (a local proxy that routes to Claude/GPT).
+**Context:** This registers the SK Kernel and Azure OpenAI chat completion connector in DI. Unlike the source project (which used a local proxy), the CRC sandbox uses Azure OpenAI with Azure AD authentication. The key differences: use `AddAzureOpenAIChatCompletion()` (not `AddOpenAIChatCompletion()`), use deployment name (not model name), endpoint has NO `/v1` suffix, and auth uses `DefaultAzureCredential` with the tenant ID.

 **Step-by-step:**
-1. Add `using Microsoft.SemanticKernel;` to the startup file
-2. Read config values from `NlxvaPricer:*` section
-3. Register: `AddOpenAIChatCompletion()` then `AddKernel()`
-4. The base URL **MUST** include `/v1` — this is the most common misconfiguration
+1. Add `using Microsoft.SemanticKernel;` and `using Azure.Identity;` to the startup file
+2. Read config values from `NlxvaPricer:*` section (AzureOpenAIEndpoint, DeploymentName, TenantId)
+3. Register: `AddAzureOpenAIChatCompletion()` then `AddKernel()`
+4. The endpoint is the Azure resource URL — do NOT add `/v1` (the Azure SDK handles path construction)
+5. Use `DefaultAzureCredential` with the tenant ID

+```csharp
+var azureEndpoint = builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"];
+var deploymentName = builder.Configuration["NlxvaPricer:DeploymentName"];
+var tenantId = builder.Configuration["NlxvaPricer:TenantId"];
+
+builder.Services.AddAzureOpenAIChatCompletion(
+    deploymentName: deploymentName,
+    endpoint: azureEndpoint,
+    credentials: new DefaultAzureCredential(
+        new DefaultAzureCredentialOptions { TenantId = tenantId }));
+builder.Services.AddKernel();
+```
+
 **Expected friction on target:**
-- **CLIProxyAPI availability**: The proxy must be running on the target machine at the configured URL. If CRC's server runs on a different machine than the developer's laptop (where CLIProxyAPI runs), you'll need network routing or to deploy CLIProxyAPI alongside CRC.
-- **API key**: CLIProxyAPI may not check the key, but the SK OpenAI connector requires a non-empty string. Use `"not-needed"` as a placeholder.
+- **`az login` not done**: `DefaultAzureCredential` tries multiple auth methods in sequence (environment vars → managed identity → Visual Studio → Azure CLI → etc.). On a developer machine, it relies on Azure CLI. If the developer hasn't run `az login --tenant <tenant-id>`, the error at runtime will be a confusing `CredentialUnavailableException` listing all the methods it tried. The fix is always: `az login --tenant <tenant-id>`.
+- **Deployment name vs model name**: In Azure portal, you deploy a model (e.g., `gpt-4o`) and give the deployment a name (e.g., `gpt4o-prod`). You pass the **deployment name** to SK, not the model name. Ask your Azure admin for the deployment name.
+- **Azure RBAC permissions**: The developer's Azure AD identity needs the "Cognitive Services OpenAI User" role on the Azure OpenAI resource. Without it, you'll get a 403.

 **Verify it works:**
-- `dotnet build` succeeds (SK NuGet resolved correctly)
+- `dotnet build` succeeds
 - At runtime: inject `Kernel` into a test controller and verify it resolves
-- Quick smoke test: call `kernel.GetRequiredService<IChatCompletionService>()` — should not throw
+- Quick smoke test: `kernel.GetRequiredService<IChatCompletionService>()` — should not throw
+- Full test: the diagnostic stream-test endpoint (see T6b below)

 **If it breaks — diagnostic checklist:**
-- Symptom: 404 on LLM requests
-  Cause: Base URL missing `/v1`
-  Fix: Change `http://localhost:8317` to `http://localhost:8317/v1`
-- Symptom: `HttpRequestException: Connection refused`
-  Cause: CLIProxyAPI not running
-  Fix: Start CLIProxyAPI on the target machine, verify with `curl http://localhost:8317/v1/models`
+- Symptom: `CredentialUnavailableException` with "DefaultAzureCredential failed to retrieve a token"
+  Cause: Developer not logged in to Azure CLI
+  Fix: Run `az login --tenant <tenant-id>`, then restart CRC.Server
+- Symptom: HTTP 403 Forbidden from Azure OpenAI
+  Cause: Azure AD identity lacks "Cognitive Services OpenAI User" role
+  Fix: Ask Azure admin to grant the role on the Azure OpenAI resource
+- Symptom: HTTP 404 on Azure OpenAI endpoint
+  Cause: Wrong deployment name, or deployment doesn't exist
+  Fix: Verify deployment name in Azure portal → Azure OpenAI → Deployments
 - Symptom: `InvalidOperationException: No service for type IChatCompletionService`
-  Cause: `AddOpenAIChatCompletion()` not called before `AddKernel()`
-  Fix: Ensure registration order: OpenAIChatCompletion first, then Kernel
+  Cause: `AddAzureOpenAIChatCompletion()` not called before `AddKernel()`
+  Fix: Ensure registration order: AzureOpenAIChatCompletion first, then Kernel

+---
+
+### T6b: Verify streaming hop 1 (Azure OpenAI → CRC.Server)
+
+**Prerequisites:** T6 (SK registered), T7 (controller exists — or add the diagnostic endpoint to any controller temporarily).
+
+**Context:** Before building the full UI, verify that tokens actually stream from Azure OpenAI through CRC.Server. This catches buffering issues early (response compression middleware, Azure API Management, corporate proxies).
+
+**Step-by-step:**
+1. Add a temporary diagnostic endpoint to NlxvaPricerController (see Critical Pattern #8 in export-spec)
+2. Run: `curl -N https://localhost:7100/api/nlxva-pricer/stream-test`
+3. Watch the timestamps in the output
+
+**What correct streaming looks like:**
+```
+data: [450ms] 1 ← timestamps spread across seconds
+data: [620ms]
+data: [780ms] 2
+data: [950ms]
+data: [1100ms] 3
+```
+
+**What buffered streaming looks like:**
+```
+data: [8200ms] 1 ← all timestamps clustered at the end
+data: [8201ms]
+data: [8202ms] 2
+data: [8203ms]
+```
+
+**If buffered — check these in order:**
+1. **Response compression middleware**: If CRC.Server has `app.UseResponseCompression()`, it buffers SSE to compress. Add `Response.Headers["Content-Encoding"] = "identity";` in the controller to opt out.
+2. **Azure API Management (APIM)**: If APIM sits in front of the Azure OpenAI resource, it buffers by default. Need `forward-request` policy with `buffer-response="false"`.
+3. **Corporate HTTPS proxy**: Check `echo $HTTPS_PROXY` on the server. May need proxy bypass for `*.openai.azure.com`.
+4. **IIS**: If CRC runs under IIS, add `responseBufferLimit="0"` in web.config.
+
+**Always set these headers on SSE endpoints:**
+```csharp
+Response.ContentType = "text/event-stream";
+Response.Headers["Cache-Control"] = "no-cache";
+Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering
+```
+
+5. Remove the diagnostic endpoint after verification.
+
 ---

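Review note on T6b: Critical Pattern #8 lives in the export-spec and is not reproduced in this diff, so the following is only a sketch of what a temporary `stream-test` endpoint could look like. The controller name `StreamTestController`, the `GET` verb, and the counting prompt are assumptions for illustration; the route, the three response headers, and the `data: [Nms] token` output shape follow the T6b text.

```csharp
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

[ApiController]
[Route("api/nlxva-pricer")]
public class StreamTestController : ControllerBase
{
    private readonly Kernel _kernel;

    public StreamTestController(Kernel kernel) => _kernel = kernel;

    // Temporary diagnostic: streams model tokens as SSE events, each stamped
    // with the milliseconds elapsed since the request started.
    [HttpGet("stream-test")]
    public async Task StreamTest(CancellationToken ct)
    {
        Response.ContentType = "text/event-stream";
        Response.Headers["Cache-Control"] = "no-cache";
        Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering

        var chat = _kernel.GetRequiredService<IChatCompletionService>();
        var history = new ChatHistory();
        history.AddUserMessage("Count from 1 to 10, one number per line."); // assumed prompt

        var clock = Stopwatch.StartNew();
        await foreach (var chunk in chat.GetStreamingChatMessageContentsAsync(
            history, kernel: _kernel, cancellationToken: ct))
        {
            // Trim so newline-only tokens do not break SSE framing.
            var token = chunk.Content?.Trim() ?? string.Empty;
            await Response.WriteAsync($"data: [{clock.ElapsedMilliseconds}ms] {token}\n\n", ct);
            await Response.Body.FlushAsync(ct); // push each event immediately
        }
    }
}
```

Because every event is flushed as soon as it is written, clustered timestamps in the `curl -N` output point at buffering somewhere outside this action, which is the situation the checks above are meant to narrow down.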
@@ -561,10 +633,11 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
|
|||||||
|
|
||||||
**Step-by-step checklist:**
|
**Step-by-step checklist:**
|
||||||
|
|
||||||
- [ ] `NlxvaPricer:LlmBaseUrl` in CRC.Server `appsettings.json` — default `http://localhost:8317/v1`
|
- [ ] `NlxvaPricer:AzureOpenAIEndpoint` in CRC.Server `appsettings.json` — e.g., `https://your-resource.openai.azure.com/` — **no `/v1`**. What happens if missing: SK registration fails at startup
|
||||||
- [ ] `NlxvaPricer:LlmModel` in CRC.Server `appsettings.json` — default `claude-sonnet-4-6`
|
- [ ] `NlxvaPricer:DeploymentName` in CRC.Server `appsettings.json` — the Azure deployment name (not model name). Get from Azure portal → Azure OpenAI → Deployments
|
||||||
- [ ] `NlxvaPricer:LlmApiKey` in CRC.Server `appsettings.json` — default `not-needed`
|
- [ ] `NlxvaPricer:TenantId` in CRC.Server `appsettings.json` — Azure AD tenant ID. Same tenant CRC uses for Microsoft.Identity.Web auth
|
||||||
- [ ] `NlxvaPricer:FewShotPath` in CRC.Server `appsettings.json` — default `examples/extraction`
|
- [ ] `NlxvaPricer:FewShotPath` in CRC.Server `appsettings.json` — default `examples/extraction`
|
||||||
|
- [ ] Developer has run `az login --tenant <tenant-id>` — `DefaultAzureCredential` needs this. Failure shows at first LLM call, not at startup
|
||||||
- [ ] `ExternalApis:CounterpartyBaseUrl` — default `http://localhost:5000/api/counterparty` (or use CRC's existing)
|
- [ ] `ExternalApis:CounterpartyBaseUrl` — default `http://localhost:5000/api/counterparty` (or use CRC's existing)
|
||||||
- [ ] `ExternalApis:TradeBaseUrl` — default `http://localhost:5000/api/trade` (or use CRC's existing)
|
- [ ] `ExternalApis:TradeBaseUrl` — default `http://localhost:5000/api/trade` (or use CRC's existing)
|
||||||
- [ ] `ExternalApis:CurrencyBaseUrl` — default `http://localhost:5000/api/currency` (or use CRC's existing)
|
- [ ] `ExternalApis:CurrencyBaseUrl` — default `http://localhost:5000/api/currency` (or use CRC's existing)
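
The two Azure OpenAI checklist items most likely to be misconfigured (a leftover `/v1` suffix and an empty deployment name) can be checked before Semantic Kernel is registered. The following is a minimal sketch, not part of CRC or Semantic Kernel: `NlxvaConfigCheck` and `Validate` are hypothetical names, and the requirement that the endpoint uses `https` is an assumption about how Azure OpenAI resources are normally exposed.

```csharp
using System;
using System.Collections.Generic;

// Usage: an endpoint that still carries the old "/v1" suffix is reported.
foreach (var problem in NlxvaConfigCheck.Validate(
             "https://your-resource.openai.azure.com/v1", "gpt4o-prod"))
{
    Console.WriteLine(problem);
}

// Hypothetical helper (assumption): not an existing CRC or SK type.
static class NlxvaConfigCheck
{
    // Returns a list of human-readable problems; an empty list means the values look sane.
    public static List<string> Validate(string? endpoint, string? deploymentName)
    {
        var problems = new List<string>();

        // Assumption: Azure OpenAI endpoints are absolute https URLs.
        if (!Uri.TryCreate(endpoint, UriKind.Absolute, out var uri) || uri.Scheme != Uri.UriSchemeHttps)
        {
            problems.Add("NlxvaPricer:AzureOpenAIEndpoint must be an absolute https URL.");
        }
        else if (uri.AbsolutePath.TrimEnd('/').EndsWith("/v1", StringComparison.OrdinalIgnoreCase))
        {
            // The Azure connector builds the request path itself, so "/v1" must not be present.
            problems.Add("NlxvaPricer:AzureOpenAIEndpoint must not end with /v1.");
        }

        if (string.IsNullOrWhiteSpace(deploymentName))
        {
            problems.Add("NlxvaPricer:DeploymentName is missing (use the Azure deployment name, not the model name).");
        }

        return problems;
    }
}
```

Calling `Validate` with the values read from `builder.Configuration` and throwing when the list is non-empty would surface these mistakes at startup instead of at the first LLM call.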
|
||||||
@@ -579,7 +652,7 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
|
|||||||
**Full verification sequence:**
|
**Full verification sequence:**
|
||||||
|
|
||||||
1. `dotnet build --configuration release CRC.sln` — 0 errors, 0 new warnings
|
1. `dotnet build --configuration release CRC.sln` — 0 errors, 0 new warnings
|
||||||
2. Start CLIProxyAPI on target machine
|
2. Ensure developer has run `az login --tenant <tenant-id>`
|
||||||
3. Start CRC.Server
|
3. Start CRC.Server
|
||||||
4. Navigate to CRC.Client in browser
|
4. Navigate to CRC.Client in browser
|
||||||
5. Verify "NL XVA Pricer" appears in sidebar
|
5. Verify "NL XVA Pricer" appears in sidebar
|
||||||
@@ -598,7 +671,7 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
|
|||||||
|
|
||||||
| # | Symptom | Likely Cause | Fix |
|
| # | Symptom | Likely Cause | Fix |
|
||||||
|---|---|---|---|
|
|---|---|---|---|
|
||||||
| 1 | 404 on `/v1/chat/completions` | Base URL missing `/v1` suffix | Set `NlxvaPricer:LlmBaseUrl` to `http://localhost:8317/v1` |
|
| 1 | 404 on Azure OpenAI endpoint | Wrong deployment name or endpoint URL | Verify deployment name in Azure portal; endpoint should be `https://<resource>.openai.azure.com/` with NO `/v1` |
|
||||||
| 2 | CORS 403 in browser console | CORS policy doesn't cover CRC.Client origin or `text/event-stream` | Add CRC.Client origin with `AllowAnyHeader()` in CORS config |
|
| 2 | CORS 403 in browser console | CORS policy doesn't cover CRC.Client origin or `text/event-stream` | Add CRC.Client origin with `AllowAnyHeader()` in CORS config |
|
||||||
| 3 | No streaming — entire response at once | `SetBrowserResponseStreamingEnabled(true)` missing on client | Add to HttpRequestMessage before SendAsync |
|
| 3 | No streaming — entire response at once | `SetBrowserResponseStreamingEnabled(true)` missing on client | Add to HttpRequestMessage before SendAsync |
|
||||||
| 4 | `NotSupportedException: Synchronous operations` | Using `reader.EndOfStream` in WASM | Replace with `while ((line = await ReadLineAsync()) != null)` |
|
| 4 | `NotSupportedException: Synchronous operations` | Using `reader.EndOfStream` in WASM | Replace with `while ((line = await ReadLineAsync()) != null)` |
|
||||||
@@ -611,7 +684,10 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
|
|||||||
| 11 | `FileNotFoundException` for instruction-template.txt | Examples folder not at ContentRootPath | Log ContentRootPath; verify examples location; update FewShotPath config |
|
| 11 | `FileNotFoundException` for instruction-template.txt | Examples folder not at ContentRootPath | Log ContentRootPath; verify examples location; update FewShotPath config |
|
||||||
| 12 | Empty few-shot examples (only system message) | Subdirectory structure wrong | Verify `examples/extraction/few-shot/01/input.html` exists |
|
| 12 | Empty few-shot examples (only system message) | Subdirectory structure wrong | Verify `examples/extraction/few-shot/01/input.html` exists |
|
||||||
| 13 | `NuGet restore error` for SemanticKernel | Package not in GV Artifactory feed | Request mirroring or temporary nuget.org source |
|
| 13 | `NuGet restore error` for SemanticKernel | Package not in GV Artifactory feed | Request mirroring or temporary nuget.org source |
|
||||||
| 14 | `HttpRequestException: Connection refused` | CLIProxyAPI not running | Start proxy; verify with `curl http://localhost:8317/v1/models` |
|
| 14 | `CredentialUnavailableException` from DefaultAzureCredential | Developer not logged in via Azure CLI | Run `az login --tenant <tenant-id>`, restart CRC.Server |
|
||||||
|
| 14b | HTTP 403 from Azure OpenAI | Azure AD identity lacks role | Grant "Cognitive Services OpenAI User" on the Azure OpenAI resource |
|
||||||
|
| 14c | All tokens arrive at once (no streaming) | Response compression or proxy buffering | Use stream-test diagnostic endpoint; check `UseResponseCompression()`; set `X-Accel-Buffering: no` header |
|
||||||
|
| 14d | Streaming works in curl but not in browser | Response compression only applied for browser Accept-Encoding | Add `Response.Headers["Content-Encoding"] = "identity"` in SSE endpoints |
|
||||||
| 15 | Drag-drop file not triggering extraction | `file-drop.js` not loaded | Check `<script>` tag in index.html; check browser console for JS errors |
|
| 15 | Drag-drop file not triggering extraction | `file-drop.js` not loaded | Check `<script>` tag in index.html; check browser console for JS errors |
|
||||||
| 16 | `window.fileDrop is undefined` | Script loaded after Blazor framework init | Move `<script>` tag before `_framework/blazor.webassembly.js` |
|
| 16 | `window.fileDrop is undefined` | Script loaded after Blazor framework init | Move `<script>` tag before `_framework/blazor.webassembly.js` |
|
||||||
| 17 | `JsonException` when parsing SSE data | SSE line doesn't match expected format | Add logging for raw SSE lines; check server-side WriteSSEAsync format |
|
| 17 | `JsonException` when parsing SSE data | SSE line doesn't match expected format | Add logging for raw SSE lines; check server-side WriteSSEAsync format |
|
||||||
@@ -627,6 +703,20 @@ Check CRC's MudBlazor version first: `grep MudBlazor CRC.Client.csproj`
|
|||||||
- **NuGet source:** Available on nuget.org. If CRC's GV Artifactory doesn't mirror it, this is a blocker — request mirroring.
|
- **NuGet source:** Available on nuget.org. If CRC's GV Artifactory doesn't mirror it, this is a blocker — request mirroring.
|
||||||
- **Size:** ~5MB total with dependencies
|
- **Size:** ~5MB total with dependencies
|
||||||
|
|
||||||
### Microsoft.SemanticKernel.Connectors.AzureOpenAI
- **Why needed:** Azure OpenAI-specific connector for SK (provides `AddAzureOpenAIChatCompletion()`)
- **.NET compatibility:** Same as core SK package
- **Transitive dependencies:** Pulls in `Azure.AI.OpenAI` SDK
- **NuGet source:** Same as core SK (nuget.org)
- **Note:** This is separate from the core SK package. Without it, only `AddOpenAIChatCompletion()` is available (for non-Azure endpoints).

### Azure.Identity
- **Why needed:** Provides `DefaultAzureCredential` for Azure AD authentication to Azure OpenAI
- **.NET compatibility:** .NET Standard 2.0+ (compatible with everything)
- **CRC likely already has this:** it uses `Microsoft.Identity.Web` for Azure AD auth. Check `grep Azure.Identity CRC.Server.csproj`.
- **Version conflicts:** If CRC has an older version, SK may pull in a newer one. Usually compatible, but verify with `dotnet build`.
- **NuGet source:** Available on nuget.org and commonly mirrored in enterprise feeds

### Markdig (1.1.1)
|
### Markdig (1.1.1)
|
||||||
- **Why needed:** Markdown → HTML conversion for rendering LLM responses
|
- **Why needed:** Markdown → HTML conversion for rendering LLM responses
|
||||||
- **.NET compatibility:** .NET Standard 2.0+ (compatible with everything)
|
- **.NET compatibility:** .NET Standard 2.0+ (compatible with everything)
|
||||||
@@ -664,8 +754,10 @@ If the feature needs to be removed:
|
|||||||
|
|
||||||
**NuGet packages to remove:**
|
**NuGet packages to remove:**
|
||||||
- `Microsoft.SemanticKernel` from CRC.Server
|
- `Microsoft.SemanticKernel` from CRC.Server
|
||||||
|
- `Microsoft.SemanticKernel.Connectors.AzureOpenAI` from CRC.Server
|
||||||
|
- `Azure.Identity` from CRC.Server (only if not used by other CRC features — likely IS used, so leave it)
|
||||||
- `Markdig` from CRC.Client (if not used by other features)
|
- `Markdig` from CRC.Client (if not used by other features)
|
||||||
|
|
||||||
**Config keys to remove:**
|
**Config keys to remove:**
|
||||||
- `NlxvaPricer:*` section from `appsettings.json`
|
- `NlxvaPricer:*` section (AzureOpenAIEndpoint, DeploymentName, TenantId, FewShotPath) from `appsettings.json`
|
||||||
- `ExternalApis:*` section (if only used by this feature)
|
- `ExternalApis:*` section (if only used by this feature)
|
||||||
|
|||||||
@@ -58,6 +58,8 @@ This feature is a GUEST in CRC. Existing code, patterns, and conventions take ab
|
|||||||
|
|
||||||
Add to `CRC.Server`:
|
Add to `CRC.Server`:
|
||||||
- `Microsoft.SemanticKernel` (latest stable, >=1.x)
|
- `Microsoft.SemanticKernel` (latest stable, >=1.x)
|
||||||
|
- `Microsoft.SemanticKernel.Connectors.AzureOpenAI` (for Azure OpenAI connector)
|
||||||
|
- `Azure.Identity` (for `DefaultAzureCredential` — CRC may already have this)
|
||||||
- `Markdig` 1.1.1 (if CRC.Client doesn't already have it — check first)
|
- `Markdig` 1.1.1 (if CRC.Client doesn't already have it — check first)
|
||||||
|
|
||||||
No new packages for CRC.Client or CRC.Shared (MudBlazor already present).
|
No new packages for CRC.Client or CRC.Shared (MudBlazor already present).
|
||||||
@@ -73,7 +75,7 @@ CRC.Server (ASP.NET Core)
|
|||||||
├── NlxvaPricerController
|
├── NlxvaPricerController
|
||||||
│ ├── POST /api/nlxva-pricer/chat (general chat)
|
│ ├── POST /api/nlxva-pricer/chat (general chat)
|
||||||
│ └── POST /api/nlxva-pricer/extract (email extraction)
|
│ └── POST /api/nlxva-pricer/extract (email extraction)
|
||||||
│ Uses: Semantic Kernel → CLIProxyAPI (OpenAI-compatible proxy)
|
│ Uses: Semantic Kernel → Azure OpenAI (via DefaultAzureCredential)
|
||||||
│ Uses: ExtractionPlugin (tool calling)
|
│ Uses: ExtractionPlugin (tool calling)
|
||||||
│ Uses: FewShotService (example loading)
|
│ Uses: FewShotService (example loading)
|
||||||
├── Services/
|
├── Services/
|
||||||
@@ -232,9 +234,9 @@ data: {"error":"message"}\n\n ← on failure (followed by [DONE])
|
|||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"NlxvaPricer": {
|
"NlxvaPricer": {
|
||||||
"LlmBaseUrl": "http://localhost:8317/v1",
|
"AzureOpenAIEndpoint": "https://your-resource.openai.azure.com/",
|
||||||
"LlmModel": "claude-sonnet-4-6",
|
"DeploymentName": "gpt4o-prod",
|
||||||
"LlmApiKey": "not-needed",
|
"TenantId": "<your-azure-ad-tenant-id>",
|
||||||
"FewShotPath": "examples/extraction"
|
"FewShotPath": "examples/extraction"
|
||||||
},
|
},
|
||||||
"ExternalApis": {
|
"ExternalApis": {
|
||||||
@@ -245,6 +247,11 @@ data: {"error":"message"}\n\n ← on failure (followed by [DONE])
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
If using API key auth instead of Azure AD, replace `TenantId` with:
|
||||||
|
```json
|
||||||
|
"ApiKey": "<your-azure-openai-api-key>"
|
||||||
|
```
|
||||||
|
|
||||||
## Critical Patterns
|
## Critical Patterns
|
||||||
|
|
||||||
### 1. SSE streaming in Blazor WASM — DO NOT use `reader.EndOfStream`
|
### 1. SSE streaming in Blazor WASM — DO NOT use `reader.EndOfStream`
|
||||||
@@ -276,16 +283,33 @@ while ((line = await reader.ReadLineAsync()) != null) // ← NOT EndOfStream
|
|||||||
`SetBrowserResponseStreamingEnabled(true)` is a Blazor WASM extension that tells the browser Fetch API
|
`SetBrowserResponseStreamingEnabled(true)` is a Blazor WASM extension that tells the browser Fetch API
|
||||||
to expose the response as a ReadableStream. Without it, the browser buffers the entire response.
|
to expose the response as a ReadableStream. Without it, the browser buffers the entire response.
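
The read loop from this pattern can be isolated into a small helper that is easy to exercise without a browser. This is a minimal sketch under stated assumptions: `SseLines` and `ReadPayloadsAsync` are hypothetical names, the `data: ` prefix and the `[DONE]` sentinel follow the SSE contract described in this document, and the payload strings in the usage lines are placeholders rather than real server output.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Usage: a StringReader stands in for the StreamReader over the HTTP response body.
var body = "data: first\n\ndata: second\n\ndata: [DONE]\n\n";
await foreach (var payload in SseLines.ReadPayloadsAsync(new StringReader(body)))
{
    Console.WriteLine(payload);
}

// Hypothetical helper (assumption): not an existing CRC type.
static class SseLines
{
    // Yields the payload of each "data: " line and stops at the [DONE] sentinel.
    public static async IAsyncEnumerable<string> ReadPayloadsAsync(TextReader reader)
    {
        const string prefix = "data: ";
        string? line;

        // ReadLineAsync returns null at end of stream, so EndOfStream is never touched.
        while ((line = await reader.ReadLineAsync()) != null)
        {
            // Skip the blank separator lines and anything that is not a data line.
            if (!line.StartsWith(prefix, StringComparison.Ordinal)) continue;

            var payload = line.Substring(prefix.Length);
            if (payload == "[DONE]") yield break;

            yield return payload;
        }
    }
}
```

Because the helper only depends on `TextReader`, the same loop works over the response stream in WASM and over an in-memory string in a unit test.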
|
||||||
|
|
||||||
### 2. Semantic Kernel base URL must include `/v1`
|
### 2. Azure OpenAI: use deployment name, NOT model name; NO `/v1` suffix
|
||||||
|
|
||||||
**Why:** The OpenAI SDK appends `chat/completions` directly to the base URL.
|
**Why:** Azure OpenAI uses `AddAzureOpenAIChatCompletion()`, not `AddOpenAIChatCompletion()`.
|
||||||
Without `/v1`, requests hit `/chat/completions` instead of `/v1/chat/completions` → 404.
|
The endpoint is your Azure resource URL (no `/v1` — the Azure SDK constructs the path internally).
|
||||||
|
The `deploymentName` is the name you gave the deployment in Azure portal, not the model name.
|
||||||
|
Auth uses `DefaultAzureCredential` with the tenant ID, not an API key.
|
||||||
|
|
||||||
```csharp
|
```csharp
|
||||||
builder.Services.AddOpenAIChatCompletion(
|
using Azure.Identity;
|
||||||
modelId: model,
|
|
||||||
endpoint: new Uri("http://localhost:8317/v1"), // ← MUST include /v1
|
builder.Services.AddAzureOpenAIChatCompletion(
|
||||||
apiKey: "not-needed");
|
deploymentName: builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod",
|
||||||
|
endpoint: builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
|
||||||
|
?? "https://your-resource.openai.azure.com/",
|
||||||
|
credentials: new DefaultAzureCredential(
|
||||||
|
new DefaultAzureCredentialOptions
|
||||||
|
{
|
||||||
|
TenantId = builder.Configuration["NlxvaPricer:TenantId"]
|
||||||
|
}));
|
||||||
|
```
|
||||||
|
|
||||||
|
If using API key instead of Azure AD:
|
||||||
|
```csharp
|
||||||
|
builder.Services.AddAzureOpenAIChatCompletion(
|
||||||
|
deploymentName: "gpt4o-prod",
|
||||||
|
endpoint: "https://your-resource.openai.azure.com/",
|
||||||
|
apiKey: builder.Configuration["NlxvaPricer:ApiKey"]);
|
||||||
```
|
```
|
||||||
|
|
||||||
### 3. Layout height depends on AppBar height
|
### 3. Layout height depends on AppBar height
|
||||||
@@ -371,18 +395,72 @@ if (!string.IsNullOrEmpty(parsedText))
|
|||||||
yield return parsedText;
|
yield return parsedText;
|
||||||
```
|
```
|
||||||
|
|
||||||
### 8. SSE response buffering: verify both streaming hops

**Why:** The architecture has two streaming hops: Azure OpenAI → CRC.Server → Browser.
If anything buffers in either hop, the user sees no tokens until the full response completes.
Common buffers: response compression middleware, reverse proxies (NGINX/IIS), Azure API Management.

**Diagnostic endpoint (add temporarily, remove after verifying):**
```csharp
[HttpGet("stream-test")]
public async Task StreamTest()
{
    Response.ContentType = "text/event-stream";
    Response.Headers["Cache-Control"] = "no-cache";
    Response.Headers["X-Accel-Buffering"] = "no"; // NGINX hint

    var chatService = _kernel.GetRequiredService<IChatCompletionService>();
    var history = new ChatHistory();
    history.AddUserMessage("Count from 1 to 10, one number per line.");

    var sw = System.Diagnostics.Stopwatch.StartNew();
    await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(history))
    {
        if (!string.IsNullOrEmpty(chunk.Content))
        {
            await Response.WriteAsync($"data: [{sw.ElapsedMilliseconds}ms] {chunk.Content}\n\n");
            await Response.Body.FlushAsync();
        }
    }
    await Response.WriteAsync("data: [DONE]\n\n");
}
```

**Test with:** `curl -N https://localhost:7100/api/nlxva-pricer/stream-test`
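The `[Nms]` values are stamped on the server as each chunk arrives from Azure OpenAI, so read them together with how the lines appear in the terminal:
- Timestamps spread over seconds and lines printing one by one = both hops stream
- Timestamps spread over seconds, but all lines print at once when the response ends = the CRC.Server → client hop is buffering (compression or a reverse proxy)
- All timestamps clustered together = the Azure OpenAI → CRC.Server hop is not streaming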
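
The reading of the stream-test output can be sketched as a small classifier. This is an illustration only, not part of CRC: `StreamVerdict`, `StreamTestAnalyzer`, and the 200 ms threshold are hypothetical, and the caller is assumed to have recorded the client-side arrival time of each `data:` line next to the `[Nms]` value that the server embedded in it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Usage with made-up samples: the server stamps are spread over two seconds,
// but every line reached the client within a few milliseconds of the others.
var samples = new List<(long ServerMs, long ArrivalMs)>
{
    (StreamTestAnalyzer.ParseServerMs("data: [850ms] 1") ?? 0, 3100),
    (StreamTestAnalyzer.ParseServerMs("data: [1900ms] 5") ?? 0, 3102),
    (StreamTestAnalyzer.ParseServerMs("data: [2950ms] 10") ?? 0, 3105),
};
Console.WriteLine(StreamTestAnalyzer.Classify(samples));

enum StreamVerdict { Inconclusive, Streaming, UpstreamBuffered, DownstreamBuffered }

// Hypothetical helper (assumption): not an existing CRC type.
static class StreamTestAnalyzer
{
    // Matches the "[123ms]" stamp written by the stream-test endpoint.
    private static readonly Regex Stamp = new Regex(@"^data: \[(\d+)ms\]");

    // Returns the server-side stamp of a line, or null if the line carries none.
    public static long? ParseServerMs(string line)
    {
        var match = Stamp.Match(line);
        return match.Success ? long.Parse(match.Groups[1].Value) : (long?)null;
    }

    // Heuristic (assumption): a hop counts as buffered when its first and last
    // samples are less than minSpreadMs apart.
    public static StreamVerdict Classify(
        IReadOnlyList<(long ServerMs, long ArrivalMs)> samples, long minSpreadMs = 200)
    {
        if (samples.Count < 2) return StreamVerdict.Inconclusive;

        var first = samples[0];
        var last = samples[samples.Count - 1];

        // Clustered server stamps: chunks reached CRC.Server all at once.
        if (last.ServerMs - first.ServerMs < minSpreadMs) return StreamVerdict.UpstreamBuffered;

        // Spread server stamps but clustered arrivals: something after CRC.Server buffered.
        if (last.ArrivalMs - first.ArrivalMs < minSpreadMs) return StreamVerdict.DownstreamBuffered;

        return StreamVerdict.Streaming;
    }
}
```

Separating the two spreads is the point of the sketch: the embedded stamps describe the Azure OpenAI → CRC.Server hop, while the arrival times describe everything between CRC.Server and the client.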

**If CRC.Server uses `UseResponseCompression()`, exclude SSE:**
```csharp
Response.Headers["Content-Encoding"] = "identity"; // opt out per-response
```

**Response headers to always set on SSE endpoints:**
```csharp
Response.ContentType = "text/event-stream";
Response.Headers["Cache-Control"] = "no-cache";
Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering
```
|
|
||||||
## Wiring
|
## Wiring
|
||||||
|
|
||||||
### CRC.Server DI registration order (add to existing Program.cs / Startup.cs)
|
### CRC.Server DI registration order (add to existing Program.cs / Startup.cs)
|
||||||
|
|
||||||
```csharp
|
```csharp
|
||||||
// 1. Semantic Kernel — OpenAI-compatible connector
|
// 1. Semantic Kernel — Azure OpenAI connector with Azure AD auth
|
||||||
var llmBaseUrl = builder.Configuration["NlxvaPricer:LlmBaseUrl"] ?? "http://localhost:8317/v1";
|
using Azure.Identity; // C# requires this at the top of the file, with the other using directives
|
||||||
var llmModel = builder.Configuration["NlxvaPricer:LlmModel"] ?? "claude-sonnet-4-6";
|
|
||||||
builder.Services.AddOpenAIChatCompletion(
|
var azureEndpoint = builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
|
||||||
modelId: llmModel,
|
?? "https://your-resource.openai.azure.com/";
|
||||||
endpoint: new Uri(llmBaseUrl),
|
var deploymentName = builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod";
|
||||||
apiKey: builder.Configuration["NlxvaPricer:LlmApiKey"] ?? "not-needed");
|
var tenantId = builder.Configuration["NlxvaPricer:TenantId"];
|
||||||
|
|
||||||
|
builder.Services.AddAzureOpenAIChatCompletion(
|
||||||
|
deploymentName: deploymentName,
|
||||||
|
endpoint: azureEndpoint,
|
||||||
|
credentials: new DefaultAzureCredential(
|
||||||
|
new DefaultAzureCredentialOptions { TenantId = tenantId }));
|
||||||
builder.Services.AddKernel();
|
builder.Services.AddKernel();
|
||||||
|
|
||||||
// 2. External API typed HttpClients
|
// 2. External API typed HttpClients
|
||||||