Feature: Natural Language XVA Pricer
Target: CRC (Blazor WASM Hosted / ASP.NET Core / MudBlazor / .NET 8.0)
Source: ChatAgent — cumulative export of all 12 changes
Includes: chat-ui, chat-streaming, semantic-kernel, multi-turn, rich-text, sidebar-nav, prompt-settings, extraction-schema, extraction-tools, few-shot-prompting, extraction-endpoint, email-upload
Skipped: migrate-claude-md-to-openspec (documentation only), add-test-coverage (adapt to CRC test conventions separately)
Integration Rule
This feature is a GUEST in CRC. Existing code, patterns, and conventions take absolute precedence.
- DO NOT modify existing files, components, layouts, services, routing, or DI registrations in CRC
- DO NOT replace existing patterns (e.g., if CRC uses a different HttpClient pattern, use theirs)
- DO add new files, new nav links, new routes, new DI registrations
- DO conform to CRC naming conventions: E-prefix enums, I-prefix interfaces, *Dto/*Request/*Response DTOs, PascalCase constants, {Subject}Test test classes
- DO use CRC.Shared for DTOs (not a new shared project)
- If a task conflicts with existing CRC code, STOP and ask the user
- If CRC already has an equivalent service (HttpClient wrapper, markdown renderer), use the existing one
Adapt-to-target notes
- CRC uses CRC.Server (not a standalone API project): add controller and services there
- CRC uses CRC.Client: add pages, layout changes, client services there
- CRC uses CRC.Shared: add DTOs there
- CRC uses Scrutor for DI assembly scanning: register new services compatibly
- CRC uses Fluxor for client state; this feature uses local component state (no Fluxor needed), which is fine for an isolated page
- CRC uses Serilog: use ILogger<T> via DI (Serilog handles the sink)
- CRC uses Azure AD auth in prod, DevAuth in dev: add [Authorize] if CRC controllers require it
- CRC uses gv_web_config.csv as primary config: put LLM config in appsettings.json (secondary config), where CRC already stores Serilog/DevAuth settings
- CRC AppBar is regular height (64px), not Dense (48px): adjust CSS calc accordingly
Target Layout
+------------------------------------------------------------------+
| CRC AppBar (64px, blue, Elevation 1) |
| [=] CRC 0.0.0 APR-CRC-PROD-LDN-DEV |
+------+-----------------------------------------------------------+
| Drawer| MudMainContent |
| Home | |
| Pricer| (routed page content) |
| Mkt | |
| XVA | |
| Sales | |
|>NLPric| <-- NEW: "NL XVA Pricer" nav item, route /nlxva-pricer |
| | |
+------+-----------------------------------------------------------+
- Feature name: NL XVA Pricer (short for Natural Language XVA Pricer)
- Route: /nlxva-pricer
- Navigation: new MudNavLink in the existing NavMenu component
- Icon: Icons.Material.Filled.SmartToy
- AppBar height: 64px (CRC uses regular, NOT Dense)
- CSS viewport calc: calc(100vh - 64px) (NOT 48px)
Packages
Add to CRC.Server:
- Microsoft.SemanticKernel (latest stable, >=1.x)
- Microsoft.SemanticKernel.Connectors.AzureOpenAI (for Azure OpenAI connector)
- Azure.Identity (for DefaultAzureCredential; CRC may already have this)
- Markdig 1.1.1 (if CRC.Client doesn't already have it; check first)
No new packages for CRC.Client or CRC.Shared (MudBlazor already present).
Architecture
CRC.Client (WASM)
|
| HTTP REST (SSE streaming)
|
CRC.Server (ASP.NET Core)
├── NlxvaPricerController
│ ├── POST /api/nlxva-pricer/chat (general chat)
│ └── POST /api/nlxva-pricer/extract (email extraction)
│ Uses: Semantic Kernel → Azure OpenAI (via DefaultAzureCredential)
│ Uses: ExtractionPlugin (tool calling)
│ Uses: FewShotService (example loading)
├── Services/
│ ├── FewShotService (singleton, loads examples at startup)
│ ├── CounterpartyApiClient (typed HttpClient)
│ ├── TradeApiClient (typed HttpClient)
│ └── CurrencyApiClient (typed HttpClient)
├── Plugins/
│ └── ExtractionPlugin ([KernelFunction] tools)
├── CRC.Shared (DTOs)
└── CRC.Component (if reusable Blazor components needed)
Two endpoints, same SSE streaming contract. General chat supports system prompt + model settings. Extraction uses few-shot prefix (not user system prompt) and extraction-specific tools.
Components
Page: NlxvaPricer.razor → CRC.Client/Pages/NlxvaPricer.razor
- Route: @page "/nlxva-pricer"
- MudTabs with 3 panels: Chat, System Prompt, Model Settings (KeepPanelsAlive=true)
- Chat panel: message list (scrollable), input area (text field + send + upload button), drag-drop zone
- Extraction mode: tracked by _isExtractionMode bool; routes subsequent messages to extract endpoint
- Streaming: consumes IAsyncEnumerable<string>, appends token-by-token to assistant message
- Markdown rendering: assistant messages rendered via MarkdownService + MarkupString
- HTML render cache: Dictionary<NlxvaChatMessage, string> avoids re-running Markdig on completed messages
- JS interop: auto-scroll, drag-and-drop file handling via file-drop.js
Client service: NlxvaPricerApiClient → CRC.Client/Services/NlxvaPricerApiClient.cs
- Typed HttpClient wrapper
- SendChatStreamingAsync(NlxvaChatRequest) → POST /api/nlxva-pricer/chat, returns IAsyncEnumerable<string>
- SendExtractionStreamingAsync(NlxvaExtractionRequest) → POST /api/nlxva-pricer/extract, returns IAsyncEnumerable<string>
- SSE parsing: read line-by-line, extract data: {"text":"..."} events, yield text deltas, stop at [DONE]
Client service: MarkdownService → CRC.Client/Services/MarkdownService.cs
- Markdig pipeline with UseAdvancedExtensions()
- HTML sanitization via tag/attribute allowlist (p, h1-h6, strong, em, code, pre, ul, ol, li, a[href], table/thead/tbody/tr/th/td, br, blockquote)
- Strips <script>, <style> blocks entirely, strips event handler attributes
- Singleton registration
Controller: NlxvaPricerController → CRC.Server/Controllers/NlxvaPricerController.cs
- [Route("api/nlxva-pricer")]
- POST /chat: builds SK ChatHistory from messages + optional system prompt, streams SSE
- POST /extract: builds ChatHistory from FewShotService prefix + email, streams SSE
- Both endpoints: import ExtractionPlugin, enable FunctionChoiceBehavior.Auto()
- SSE format: data: {"text":"..."}\n\n per token, data: [DONE]\n\n at end, data: {"error":"..."}\n\n on failure
Plugin: ExtractionPlugin → CRC.Server/Plugins/ExtractionPlugin.cs
- 4 [KernelFunction] methods:
  - lookup_counterparty(string name) → calls CounterpartyApiClient, returns JSON NlxvaValidationResult
  - validate_trade(long tradeId) → calls TradeApiClient
  - validate_currency(string currencyCode) → calls CurrencyApiClient
  - validate_schema(string extractionResultJson) → local JSON validation against TradeItem schema
- All return serialized NlxvaValidationResult JSON (so the LLM can reason about it)
- HTTP errors are caught and returned as structured messages (not thrown)
Service: FewShotService → CRC.Server/Services/FewShotService.cs
- Loads instruction template + few-shot examples from disk at startup
- Caches a ChatHistory prefix (system message + alternating user/assistant example turns)
- CloneWithEmail(string emailHtml) → clones prefix + appends email as final user message
- CloneWithEmailAndMessages(string emailHtml, List<NlxvaChatMessage> messages) → for follow-ups
- Singleton lifetime
API Clients: CounterpartyApiClient, TradeApiClient, CurrencyApiClient
- Each: typed HttpClient with single async method wrapping an external API call
- Registered via AddHttpClient<T>() with base URL from appsettings.json (the base URL must end with / so that the relative paths below resolve under it instead of replacing its last segment)
- CounterpartyApiClient.LookupAsync(name) → GET lookup?name={name} → List<NlxvaCandidateMatch>
- TradeApiClient.ValidateAsync(tradeId) → GET validate/{tradeId} → TradeValidationResponse
- CurrencyApiClient.ValidateAsync(code) → GET validate/{code} → CurrencyValidationResponse
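A typed client of this shape might look like the sketch below. It assumes the lookup response deserializes into the NlxvaCandidateMatch DTO from Contracts (redeclared here only to keep the example self-contained); the BuildLookupUri helper is a hypothetical addition that makes the URL construction easy to test.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;

// Mirrors the DTO in Contracts; redeclared only so the sketch compiles on its own.
public class NlxvaCandidateMatch
{
    public string Name { get; set; } = string.Empty;
    public string LegalEntity { get; set; } = string.Empty;
}

public sealed class CounterpartyApiClient
{
    private readonly HttpClient _http;

    public CounterpartyApiClient(HttpClient http) => _http = http;

    // Hypothetical helper: a relative URI with no leading slash, so it resolves
    // under HttpClient.BaseAddress (which needs a trailing slash for that to work).
    public static string BuildLookupUri(string name) =>
        $"lookup?name={Uri.EscapeDataString(name)}";

    public async Task<List<NlxvaCandidateMatch>> LookupAsync(
        string name, CancellationToken ct = default)
    {
        // GetFromJsonAsync throws HttpRequestException on non-success status codes;
        // the plugin is expected to catch that and return a structured message.
        var matches = await _http.GetFromJsonAsync<List<NlxvaCandidateMatch>>(
            BuildLookupUri(name), ct);
        return matches ?? new List<NlxvaCandidateMatch>();
    }
}
```

With a base address of http://localhost:5000/api/counterparty/ the request goes to .../api/counterparty/lookup?name=...; without the trailing slash the counterparty segment would be dropped during URI resolution.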
JS: file-drop.js → CRC.Client/wwwroot/js/file-drop.js
- Registers dragover/dragenter/dragleave/drop handlers on a CSS-selector target
- Reads dropped file as text via FileReader
- Calls back to .NET via DotNetObjectReference.invokeMethodAsync
Contracts
DTOs (all in CRC.Shared namespace, adapt naming to CRC conventions)
// NlxvaChatMessage.cs
public class NlxvaChatMessage
{
public string Role { get; set; } = string.Empty; // "user" | "assistant"
public string Content { get; set; } = string.Empty;
public DateTime Timestamp { get; set; }
}
// NlxvaChatRequest.cs — POST /api/nlxva-pricer/chat
public class NlxvaChatRequest
{
public List<NlxvaChatMessage> Messages { get; set; } = new();
public string? SystemPrompt { get; set; }
public NlxvaModelSettings? Settings { get; set; }
}
// NlxvaModelSettings.cs
public class NlxvaModelSettings
{
public double? Temperature { get; set; } // 0.0–2.0
public double? TopP { get; set; } // 0.0–1.0
public int? MaxTokens { get; set; } // 1–4096
}
// NlxvaExtractionRequest.cs — POST /api/nlxva-pricer/extract
public class NlxvaExtractionRequest
{
public string EmailHtml { get; set; } = string.Empty;
public List<NlxvaChatMessage> Messages { get; set; } = new();
}
// TradeItem.cs: snake_case JSON for downstream systems
// (needs: using System.Text.Json.Serialization; for [JsonPropertyName])
public class TradeItem
{
[JsonPropertyName("valuedate")] public string? Valuedate { get; set; }
[JsonPropertyName("counterparty")] public string? Counterparty { get; set; }
[JsonPropertyName("legal_entity")] public string? LegalEntity { get; set; }
[JsonPropertyName("trade_id")] public long TradeId { get; set; }
[JsonPropertyName("display_ccy")] public string? DisplayCcy { get; set; }
[JsonPropertyName("pv")] public double Pv { get; set; }
[JsonPropertyName("breakclause")] public string? Breakclause { get; set; }
}
// NlxvaExtractionResult.cs
public class NlxvaExtractionResult
{
[JsonPropertyName("items")]
public List<TradeItem> Items { get; set; } = new();
}
// NlxvaValidationResult.cs
public class NlxvaValidationResult
{
public bool IsValid { get; set; }
public List<string> Errors { get; set; } = new();
public List<NlxvaCandidateMatch>? Candidates { get; set; }
}
public class NlxvaCandidateMatch
{
public string Name { get; set; } = string.Empty;
public string LegalEntity { get; set; } = string.Empty;
}
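The ranges noted in the NlxvaModelSettings comments could be enforced server-side before the values reach the model. The following is a minimal sketch; the validator class and its error messages are hypothetical, and it takes the three values as parameters so it compiles without the DTO.

```csharp
using System.Collections.Generic;

// Hypothetical validator for the ranges documented on NlxvaModelSettings.
public static class NlxvaModelSettingsValidator
{
    // Null means "not set" and is always accepted; only out-of-range values are reported.
    public static List<string> Validate(double? temperature, double? topP, int? maxTokens)
    {
        var errors = new List<string>();
        if (temperature is < 0.0 or > 2.0)
            errors.Add("Temperature must be between 0.0 and 2.0.");
        if (topP is < 0.0 or > 1.0)
            errors.Add("TopP must be between 0.0 and 1.0.");
        if (maxTokens is < 1 or > 4096)
            errors.Add("MaxTokens must be between 1 and 4096.");
        return errors;
    }
}
```

A controller could call this with request.Settings?.Temperature and the other two values, and return 400 Bad Request when the list is non-empty.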
SSE Wire Format
data: {"text":"token here"}\n\n ← per token
data: [DONE]\n\n ← stream complete
data: {"error":"message"}\n\n ← on failure (followed by [DONE])
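One way to keep both endpoints on exactly this wire format is to build every event through a single helper. The sketch below is an assumption about how that could be done; SseEvents and its members are hypothetical names, not part of the source.

```csharp
using System.Text.Json;

// Hypothetical helper that formats events in the wire format above.
public static class SseEvents
{
    public const string Done = "data: [DONE]\n\n";

    // JsonSerializer escapes quotes, backslashes, and newlines inside the token,
    // so the payload always stays on a single "data:" line.
    public static string Text(string token) =>
        $"data: {JsonSerializer.Serialize(new { text = token })}\n\n";

    public static string Error(string message) =>
        $"data: {JsonSerializer.Serialize(new { error = message })}\n\n";
}
```

In the controller, each streamed chunk would be written with Response.WriteAsync(SseEvents.Text(chunk.Content)) followed by a flush, and a failure would write SseEvents.Error(...) and then SseEvents.Done.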
Config keys (appsettings.json)
{
"NlxvaPricer": {
"AzureOpenAIEndpoint": "https://your-resource.openai.azure.com/",
"DeploymentName": "gpt4o-prod",
"TenantId": "<your-azure-ad-tenant-id>",
"FewShotPath": "examples/extraction"
},
"ExternalApis": {
"CounterpartyBaseUrl": "http://localhost:5000/api/counterparty/",
"TradeBaseUrl": "http://localhost:5000/api/trade/",
"CurrencyBaseUrl": "http://localhost:5000/api/currency/"
}
}
If using API key auth instead of Azure AD, replace TenantId with:
"ApiKey": "<your-azure-openai-api-key>"
Critical Patterns
1. SSE streaming in Blazor WASM — DO NOT use reader.EndOfStream
Why: EndOfStream performs a synchronous peek read. Blazor WASM's async streaming pipeline
does not support synchronous reads — it will hang or throw.
Copy this pattern:
httpRequest.SetBrowserResponseStreamingEnabled(true);
using var response = await _httpClient.SendAsync(
httpRequest, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();
using var stream = await response.Content.ReadAsStreamAsync();
using var reader = new StreamReader(stream);
string? line;
while ((line = await reader.ReadLineAsync()) != null) // ← NOT EndOfStream
{
if (!line.StartsWith("data: ")) continue;
var data = line.Substring(6);
if (data == "[DONE]") yield break;
// parse {"text":"..."} and yield
}
SetBrowserResponseStreamingEnabled(true) is a Blazor WASM extension method (namespace Microsoft.AspNetCore.Components.WebAssembly.Http) that tells the browser Fetch API
to expose the response as a ReadableStream. Without it, the browser buffers the entire response.
2. Azure OpenAI: use deployment name, NOT model name; NO /v1 suffix
Why: Azure OpenAI uses AddAzureOpenAIChatCompletion(), not AddOpenAIChatCompletion().
The endpoint is your Azure resource URL (no /v1 — the Azure SDK constructs the path internally).
The deploymentName is the name you gave the deployment in Azure portal, not the model name.
Auth uses DefaultAzureCredential with the tenant ID, not an API key.
using Azure.Identity;
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod",
endpoint: builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
?? "https://your-resource.openai.azure.com/",
credentials: new DefaultAzureCredential(
new DefaultAzureCredentialOptions
{
TenantId = builder.Configuration["NlxvaPricer:TenantId"]
}));
If using API key instead of Azure AD:
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: "gpt4o-prod",
endpoint: "https://your-resource.openai.azure.com/",
apiKey: builder.Configuration["NlxvaPricer:ApiKey"]);
3. Layout height depends on AppBar height
Why: CRC uses a regular AppBar (64px), not Dense (48px). Magic CSS values must match.
::deep .tab-container {
height: calc(100vh - 64px); /* 64px = CRC regular AppBar height */
}
If CRC uses 100dvh elsewhere, prefer that over 100vh for mobile viewport correctness.
4. Markdown render caching during streaming
Why: Without caching, every StateHasChanged() during streaming re-runs Markdig on ALL messages,
causing visible lag as conversation grows. Only the streaming message should re-render.
private readonly Dictionary<NlxvaChatMessage, string> _renderedHtmlCache = new();
private string GetRenderedHtml(NlxvaChatMessage message)
{
if (_renderedHtmlCache.TryGetValue(message, out var cached))
return cached;
return Markdown.ConvertToHtml(message.Content);
}
// In finally block after streaming completes:
_renderedHtmlCache[assistantMessage] = Markdown.ConvertToHtml(assistantMessage.Content);
5. ExtractionPlugin tool results must be serialized JSON strings
Why: SK passes the return value as a string to the LLM. The LLM needs structured JSON to reason about validation results, error messages, and candidate lists.
[KernelFunction("lookup_counterparty")]
[Description("Looks up counterparty candidates by name...")]
public async Task<string> LookupCounterparty(string name)
{
var result = new NlxvaValidationResult();
// ... populate result ...
return JsonSerializer.Serialize(result); // ← return JSON string, not object
}
6. Per-request plugin import (not global)
Why: ExtractionPlugin is registered as scoped and depends on typed HttpClients (which AddHttpClient<T>() registers as transient), so it must be resolved and imported per-request, not registered globally on the Kernel at startup.
var extractionPlugin = HttpContext.RequestServices.GetRequiredService<ExtractionPlugin>();
_kernel.ImportPluginFromObject(extractionPlugin, "Extraction");
7. C# yield cannot appear inside try-catch
Why: Language restriction: yield return is not allowed inside a try block that has a catch clause. SSE parsing has to parse JSON (which can throw) and also yield. Solution: parse into local variables inside the try block, then yield outside it.
string? parsedText = null;
string? parsedError = null;
try
{
using var doc = JsonDocument.Parse(data);
var root = doc.RootElement;
if (root.TryGetProperty("error", out var err))
parsedError = err.GetString();
else if (root.TryGetProperty("text", out var txt))
parsedText = txt.GetString();
}
catch (JsonException) { /* skip malformed */ }
if (parsedError != null)
throw new HttpRequestException($"API error: {parsedError}");
if (!string.IsNullOrEmpty(parsedText))
yield return parsedText;
8. SSE response buffering — verify both streaming hops
Why: The architecture has two streaming hops: Azure OpenAI → CRC.Server → Browser. If anything buffers in either hop, the user sees no tokens until the full response completes. Common buffers: response compression middleware, reverse proxies (NGINX/IIS), Azure API Management.
Diagnostic endpoint (add temporarily, remove after verifying):
[HttpGet("stream-test")]
public async Task StreamTest()
{
Response.ContentType = "text/event-stream";
Response.Headers["Cache-Control"] = "no-cache";
Response.Headers["X-Accel-Buffering"] = "no"; // NGINX hint
var chatService = _kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("Count from 1 to 10, one number per line.");
var sw = System.Diagnostics.Stopwatch.StartNew();
await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(history))
{
if (!string.IsNullOrEmpty(chunk.Content))
{
await Response.WriteAsync($"data: [{sw.ElapsedMilliseconds}ms] {chunk.Content}\n\n");
await Response.Body.FlushAsync();
}
}
await Response.WriteAsync("data: [DONE]\n\n");
}
Test with: curl -N https://localhost:7100/api/nlxva-pricer/stream-test
- Timestamps spread over seconds = streaming works
- All timestamps clustered at the end = something is buffering
If CRC.Server uses UseResponseCompression(), exclude SSE:
Response.Headers["Content-Encoding"] = "identity"; // opt out per-response
Response headers to always set on SSE endpoints:
Response.ContentType = "text/event-stream";
Response.Headers["Cache-Control"] = "no-cache";
Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering
Wiring
CRC.Server DI registration order (add to existing Program.cs / Startup.cs)
// 1. Semantic Kernel — Azure OpenAI connector with Azure AD auth
using Azure.Identity;
var azureEndpoint = builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
?? "https://your-resource.openai.azure.com/";
var deploymentName = builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod";
var tenantId = builder.Configuration["NlxvaPricer:TenantId"];
builder.Services.AddAzureOpenAIChatCompletion(
deploymentName: deploymentName,
endpoint: azureEndpoint,
credentials: new DefaultAzureCredential(
new DefaultAzureCredentialOptions { TenantId = tenantId }));
builder.Services.AddKernel();
// 2. External API typed HttpClients
// Base addresses end with "/" so relative paths such as "lookup?name=..." resolve under them.
builder.Services.AddHttpClient<CounterpartyApiClient>(c =>
    c.BaseAddress = new Uri(builder.Configuration["ExternalApis:CounterpartyBaseUrl"]
        ?? "http://localhost:5000/api/counterparty/"));
builder.Services.AddHttpClient<TradeApiClient>(c =>
    c.BaseAddress = new Uri(builder.Configuration["ExternalApis:TradeBaseUrl"]
        ?? "http://localhost:5000/api/trade/"));
builder.Services.AddHttpClient<CurrencyApiClient>(c =>
    c.BaseAddress = new Uri(builder.Configuration["ExternalApis:CurrencyBaseUrl"]
        ?? "http://localhost:5000/api/currency/"));
// 3. FewShotService (singleton — loads examples once at startup)
var fewShotPath = builder.Configuration["NlxvaPricer:FewShotPath"] ?? "examples/extraction";
var fewShotAbsPath = Path.IsPathRooted(fewShotPath)
? fewShotPath
: Path.Combine(builder.Environment.ContentRootPath, fewShotPath);
builder.Services.AddSingleton(new FewShotService(fewShotAbsPath));
// 4. ExtractionPlugin (scoped: resolved once per request; its typed HttpClients are transient)
builder.Services.AddScoped<ExtractionPlugin>();
CRC.Client DI registration (add to existing Program.cs)
// Typed HttpClient for the NL XVA Pricer API
builder.Services.AddHttpClient<NlxvaPricerApiClient>(c =>
c.BaseAddress = new Uri(builder.Configuration["ApiBaseUrl"]
?? "https://localhost:7100/"));
// Markdown rendering (singleton — thread-safe, reusable)
builder.Services.AddSingleton<MarkdownService>();
CRC.Client NavMenu (add new MudNavLink)
<MudNavLink Href="/nlxva-pricer"
Icon="@Icons.Material.Filled.SmartToy"
Match="NavLinkMatch.All">
NL XVA Pricer
</MudNavLink>
CRC.Client index.html (add JS reference)
<script src="js/file-drop.js"></script>
CRC.Server CORS (if not already allowing the client origin)
Ensure the CORS policy allows the CRC.Client origin for the new endpoints.
Examples folder
Copy the examples/extraction/ folder to the CRC.Server project root:
examples/extraction/
├── instruction-template.txt
└── few-shot/
├── 01/
│ ├── input.html
│ └── output.json
├── 02/
│ ├── input.html
│ └── output.json
└── 03/
├── input.html
└── output.json
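To make the folder layout concrete, the sketch below reads it into an ordered list of (role, content) turns, which FewShotService could then convert into its cached ChatHistory prefix. FewShotLoader and LoadPrefix are hypothetical names, and the ordinal sort on folder names (01, 02, 03) is an assumption about the intended example order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical loader: returns the prompt prefix as (role, content) turns in order.
public static class FewShotLoader
{
    public static List<(string Role, string Content)> LoadPrefix(string rootPath)
    {
        // The instruction template becomes the system message.
        var turns = new List<(string Role, string Content)>
        {
            ("system", File.ReadAllText(Path.Combine(rootPath, "instruction-template.txt")))
        };

        // Assumption: example folders are consumed in ordinal name order (01, 02, 03, ...).
        var exampleDirs = Directory
            .GetDirectories(Path.Combine(rootPath, "few-shot"))
            .OrderBy(d => Path.GetFileName(d), StringComparer.Ordinal);

        foreach (var dir in exampleDirs)
        {
            // Missing files throw here, which fails fast at startup.
            turns.Add(("user", File.ReadAllText(Path.Combine(dir, "input.html"))));
            turns.Add(("assistant", File.ReadAllText(Path.Combine(dir, "output.json"))));
        }

        return turns;
    }
}
```

With the three example folders shown above, the result is one system turn followed by three user/assistant pairs, seven turns in total.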
Behavior
- Extraction mode routing: When an email is uploaded, _isExtractionMode = true. All subsequent text messages route to /extract (not /chat) until "New Chat" resets
- Follow-up disambiguation: The extraction endpoint receives full conversation history (email + all prior exchanges) so the agent has context for disambiguation
- Upload message: File upload adds a user message [Uploaded: filename.html] to the chat before streaming the extraction response
- File validation: Only .html files accepted (both drag-drop and file picker). Others show MudAlert warning
- Streaming guard: Input field, send button, upload button, and drop zone all disabled during streaming
- Multi-turn context: General chat sends full conversation history with every request
- System prompt: Only used for general chat, NOT for extraction (extraction uses fixed instruction template)
- Model settings: Only used for general chat, NOT for extraction
- Settings persistence: In-memory only (lost on page refresh) — acceptable for a debugging/iteration tool
- DotNetObjectReference disposal: Chat page implements IDisposable to dispose the JS interop reference
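The routing, file-validation, and streaming-guard rules above amount to a small state machine. The sketch below models them outside the Razor component so they can be unit-tested; ChatRoutingState and its members are hypothetical names, and the case-insensitive extension check is an assumption.

```csharp
using System;
using System.IO;

// Hypothetical model of the page's routing state (the page itself tracks _isExtractionMode).
public sealed class ChatRoutingState
{
    public bool IsExtractionMode { get; private set; }
    public bool IsStreaming { get; set; }

    // Assumption: the extension check ignores case, so "MAIL.HTML" is accepted too.
    public static bool IsAcceptedFile(string fileName) =>
        string.Equals(Path.GetExtension(fileName), ".html", StringComparison.OrdinalIgnoreCase);

    // Returns false when the upload must be rejected (wrong type, or a stream is in flight).
    public bool TryBeginUpload(string fileName)
    {
        if (IsStreaming || !IsAcceptedFile(fileName)) return false;
        IsExtractionMode = true;
        return true;
    }

    // All messages go to /extract once an email has been uploaded.
    public string CurrentEndpoint =>
        IsExtractionMode ? "api/nlxva-pricer/extract" : "api/nlxva-pricer/chat";

    // "New Chat" returns the page to general chat.
    public void NewChat() => IsExtractionMode = false;
}
```

The page would call TryBeginUpload from both the drop handler and the file picker, show the MudAlert warning when it returns false for a non-.html file, and call NewChat from the "New Chat" button.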
Few-Shot Instruction Template
The instruction template defines the extraction task. Content:
You are a trade data extraction agent. Your task is to extract structured trade data
from sales emails (typically CVA pricing requests) and return the result as JSON.
## Output Schema
Return a JSON object with an "items" array. Each item has:
- valuedate (string): dd/MM/yyyy format
- counterparty (string): full legal name from email
- trade_id (integer): Murex trade ID
- display_ccy (string): ISO currency code (£→GBP, $→USD, €→EUR)
- pv (number): plain number, no formatting
- breakclause (string): "Y" or "N" (default "N")
legal_entity is NOT included — populated later via lookup tool.
## Mapping Rules
1. FLATTEN: Each leg with unique Murex ID → separate item
2. DATE: Parse from context (e.g., "OB 27/11/2025" → "27/11/2025")
3. COUNTERPARTY: Full legal name exactly as written
4. CURRENCY: From PV column header (£→GBP, $→USD, €→EUR)
5. PV: Strip commas/symbols, plain number
6. BREAKCLAUSE: Default "N", only "Y" if explicitly mentioned
## After Extraction
Use tools: lookup_counterparty, validate_trade, validate_currency, validate_schema.
If multiple candidates, present numbered list and ask user to select.
Compression Stats
- Source code: ~3,200 lines across 25+ files
- This spec: ~350 lines
- Compression ratio: ~9:1
- Estimated typing: ~12,000 characters (vs ~110,000 for full code)