AgenticCode/openspec/exports/nlxva-pricer-spec.md
local d46b179221 feat: add porting-guide skill and NL XVA Pricer export bundle
Add /opsx:porting-guide skill that generates detailed human-readable
implementation guides as a companion to /opsx:export-spec. The AI spec
targets the agent; the porting guide targets the human developer with
design rationale, task-by-task notes, troubleshooting tables, and
rollback plans.

Generate the full NL XVA Pricer export bundle for CRC:
- nlxva-pricer-spec.md (AI-targeted portable spec)
- nlxva-pricer-openspec.md (OpenSpec proposal/design/tasks)
- nlxva-pricer-porting-guide.md (human implementation guide)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:45:47 +01:00

# Feature: Natural Language XVA Pricer
## Target: CRC (Blazor WASM Hosted / ASP.NET Core / MudBlazor / .NET 8.0)
## Source: ChatAgent — cumulative export of all 12 changes
## Includes: chat-ui, chat-streaming, semantic-kernel, multi-turn, rich-text, sidebar-nav, prompt-settings, extraction-schema, extraction-tools, few-shot-prompting, extraction-endpoint, email-upload
## Skipped: migrate-claude-md-to-openspec (documentation only), add-test-coverage (adapt to CRC test conventions separately)
---
## Integration Rule
This feature is a GUEST in CRC. Existing code, patterns, and conventions take absolute precedence.
- **DO NOT** modify existing files, components, layouts, services, routing, or DI registrations in CRC
- **DO NOT** replace existing patterns (e.g., if CRC uses a different HttpClient pattern, use theirs)
- **DO** add new files, new nav links, new routes, new DI registrations
- **DO** conform to CRC naming conventions: `E`-prefix enums, `I`-prefix interfaces, `*Dto`/`*Request`/`*Response` DTOs, PascalCase constants, `{Subject}Test` test classes
- **DO** use CRC.Shared for DTOs (not a new shared project)
- If a task conflicts with existing CRC code, **STOP and ask the user**
- If CRC already has an equivalent service (HttpClient wrapper, markdown renderer), **use the existing one**
### Adapt-to-target notes
- CRC uses `CRC.Server` (not a standalone API project) — add controller and services there
- CRC uses `CRC.Client` — add pages, layout changes, client services there
- CRC uses `CRC.Shared` — add DTOs there
- CRC uses Scrutor for DI assembly scanning — register new services compatibly
- CRC uses Fluxor for client state — this feature uses local component state (no Fluxor needed), which is fine for an isolated page
- CRC uses Serilog — use `ILogger<T>` via DI (Serilog handles the sink)
- CRC uses Azure AD auth in prod, DevAuth in dev — add `[Authorize]` if CRC controllers require it
- CRC uses `gv_web_config.csv` as primary config — put LLM config in `appsettings.json` (secondary config) where CRC already stores Serilog/DevAuth settings
- CRC AppBar is regular height (64px), not Dense (48px) — adjust CSS calc accordingly
## Target Layout
```
+------------------------------------------------------------------+
| CRC AppBar (64px, blue, Elevation 1) |
| [=] CRC 0.0.0 APR-CRC-PROD-LDN-DEV |
+------+-----------------------------------------------------------+
| Drawer| MudMainContent |
| Home | |
| Pricer| (routed page content) |
| Mkt | |
| XVA | |
| Sales | |
|>NLPric| <-- NEW: "NL XVA Pricer" nav item, route /nlxva-pricer |
| | |
+------+-----------------------------------------------------------+
```
- Feature name: **NL XVA Pricer** (short for Natural Language XVA Pricer)
- Route: `/nlxva-pricer`
- Navigation: new MudNavLink in the existing NavMenu component
- Icon: `Icons.Material.Filled.SmartToy`
- AppBar height: 64px (CRC uses regular, NOT Dense)
- CSS viewport calc: `calc(100vh - 64px)` (NOT 48px)
## Packages
Add to `CRC.Server`:
- `Microsoft.SemanticKernel` (latest stable, >=1.x)
Add to `CRC.Client` (where `MarkdownService` lives):
- `Markdig` 1.1.1 (only if CRC.Client doesn't already reference it; check first)
No new packages for CRC.Shared. MudBlazor is already present in CRC.Client.
## Architecture
```
CRC.Client (WASM)
|
| HTTP REST (SSE streaming)
|
CRC.Server (ASP.NET Core)
├── NlxvaPricerController
│ ├── POST /api/nlxva-pricer/chat (general chat)
│ └── POST /api/nlxva-pricer/extract (email extraction)
│ Uses: Semantic Kernel → CLIProxyAPI (OpenAI-compatible proxy)
│ Uses: ExtractionPlugin (tool calling)
│ Uses: FewShotService (example loading)
├── Services/
│ ├── FewShotService (singleton, loads examples at startup)
│ ├── CounterpartyApiClient (typed HttpClient)
│ ├── TradeApiClient (typed HttpClient)
│ └── CurrencyApiClient (typed HttpClient)
├── Plugins/
│ └── ExtractionPlugin ([KernelFunction] tools)
├── CRC.Shared (DTOs)
└── CRC.Component (if reusable Blazor components needed)
```
Two endpoints, same SSE streaming contract. General chat supports system prompt + model settings.
Extraction uses few-shot prefix (not user system prompt) and extraction-specific tools.
## Components
### Page: `NlxvaPricer.razor` → `CRC.Client/Pages/NlxvaPricer.razor`
- Route: `@page "/nlxva-pricer"`
- MudTabs with 3 panels: Chat, System Prompt, Model Settings (KeepPanelsAlive=true)
- Chat panel: message list (scrollable), input area (text field + send + upload button), drag-drop zone
- Extraction mode: tracked by `_isExtractionMode` bool; routes subsequent messages to extract endpoint
- Streaming: consumes `IAsyncEnumerable<string>`, appends token-by-token to assistant message
- Markdown rendering: assistant messages rendered via MarkdownService + MarkupString
- HTML render cache: `Dictionary<NlxvaChatMessage, string>` avoids re-running Markdig on completed messages
- JS interop: auto-scroll, drag-and-drop file handling via `file-drop.js`
### Client service: `NlxvaPricerApiClient` → `CRC.Client/Services/NlxvaPricerApiClient.cs`
- Typed HttpClient wrapper
- `SendChatStreamingAsync(NlxvaChatRequest)` → POST /api/nlxva-pricer/chat, returns `IAsyncEnumerable<string>`
- `SendExtractionStreamingAsync(NlxvaExtractionRequest)` → POST /api/nlxva-pricer/extract, returns `IAsyncEnumerable<string>`
- SSE parsing: read line-by-line, extract `data: {"text":"..."}` events, yield text deltas, stop at `[DONE]`
### Client service: `MarkdownService` → `CRC.Client/Services/MarkdownService.cs`
- Markdig pipeline with `UseAdvancedExtensions()`
- HTML sanitization via tag/attribute allowlist (p, h1-h6, strong, em, code, pre, ul, ol, li, a[href], table/thead/tbody/tr/th/td, br, blockquote)
- Strips `<script>`, `<style>` blocks entirely, strips event handler attributes
- Singleton registration
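
The allowlist step described above could look roughly like the following. This is a minimal sketch, not the source implementation: the names `allowedTags`, `SanitizeOnce`, and `Sanitize` are hypothetical, the decision to keep only `http(s)` and `mailto:` links is an assumption, and the regex-based approach is not a hardened sanitizer (a dedicated HTML sanitizer library would be more robust). It is written as top-level statements so it runs as a single file, and it would be applied to the HTML that Markdig produces.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Tag allowlist taken from the MarkdownService description.
var allowedTags = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "p", "h1", "h2", "h3", "h4", "h5", "h6", "strong", "em", "code", "pre",
    "ul", "ol", "li", "a", "table", "thead", "tbody", "tr", "th", "td",
    "br", "blockquote"
};

string SanitizeOnce(string html)
{
    // 1. Drop <script> and <style> elements together with their content.
    html = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1\s*>", string.Empty,
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // 2. Rebuild every remaining tag from the allowlist. All attributes are
    //    dropped (which removes on* event handlers), except a double-quoted
    //    http(s) or mailto href on <a>.
    return Regex.Replace(html, @"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>", m =>
    {
        var tag = m.Groups[2].Value.ToLowerInvariant();
        if (!allowedTags.Contains(tag)) return string.Empty;
        if (m.Groups[1].Value == "/") return $"</{tag}>";
        if (tag == "a")
        {
            var href = Regex.Match(m.Groups[3].Value,
                @"\bhref\s*=\s*""((?:https?://|mailto:)[^""]*)""",
                RegexOptions.IgnoreCase);
            if (href.Success) return $"<a href=\"{href.Groups[1].Value}\">";
        }
        return $"<{tag}>";
    });
}

// Repeat until stable, so that removing one tag cannot splice a new one together.
string Sanitize(string html)
{
    string previous;
    do
    {
        previous = html;
        html = SanitizeOnce(html);
    } while (html != previous);
    return html;
}
```

For example, `Sanitize("<p onclick=\"x()\">hi</p><script>alert(1)</script>")` returns `<p>hi</p>`. Attributes that Markdig's advanced extensions emit (heading ids, code language classes) are dropped as well, which is acceptable for chat output but is a design choice, not a requirement from the source.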
### Controller: `NlxvaPricerController` → `CRC.Server/Controllers/NlxvaPricerController.cs`
- `[Route("api/nlxva-pricer")]`
- `POST /chat`: builds SK ChatHistory from messages + optional system prompt, streams SSE
- `POST /extract`: builds ChatHistory from FewShotService prefix + email, streams SSE
- Both endpoints: import ExtractionPlugin, enable `FunctionChoiceBehavior.Auto()`
- SSE format: `data: {"text":"..."}\n\n` per token, `data: [DONE]\n\n` at end, `data: {"error":"..."}\n\n` on failure
### Plugin: `ExtractionPlugin` → `CRC.Server/Plugins/ExtractionPlugin.cs`
- 4 `[KernelFunction]` methods:
  - `lookup_counterparty(string name)` → calls CounterpartyApiClient, returns JSON ValidationResult
  - `validate_trade(long tradeId)` → calls TradeApiClient
  - `validate_currency(string currencyCode)` → calls CurrencyApiClient
  - `validate_schema(string extractionResultJson)` → local JSON validation against TradeItem schema
- All return serialized `ValidationResult` JSON (so LLM can reason about it)
- HTTP errors caught and returned as structured messages (not thrown)
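
As an illustration of what the local `validate_schema` check might involve, here is a minimal sketch using `System.Text.Json`. The helper names `TryString` and `ValidateExtraction` are hypothetical, and the concrete rules (date format, 3-letter currency code, `Y`/`N` break clause) are assumptions derived from the output schema in the instruction template, not from the source code. In the plugin, the returned errors would be copied into an `NlxvaValidationResult` and serialized.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Reads a string property; returns false when it is missing or not a JSON string.
bool TryString(JsonElement item, string name, out string value)
{
    value = string.Empty;
    if (!item.TryGetProperty(name, out var p) || p.ValueKind != JsonValueKind.String)
        return false;
    value = p.GetString() ?? string.Empty;
    return true;
}

// Hypothetical helper: returns error messages; an empty list means the JSON is valid.
List<string> ValidateExtraction(string json)
{
    var errors = new List<string>();
    JsonDocument doc;
    try { doc = JsonDocument.Parse(json); }
    catch (JsonException ex)
    {
        errors.Add($"Invalid JSON: {ex.Message}");
        return errors;
    }

    using (doc)
    {
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object
            || !root.TryGetProperty("items", out var items)
            || items.ValueKind != JsonValueKind.Array)
        {
            errors.Add("Root must be an object with an \"items\" array.");
            return errors;
        }

        var index = 0;
        foreach (var item in items.EnumerateArray())
        {
            var at = $"items[{index++}]";
            if (item.ValueKind != JsonValueKind.Object)
            {
                errors.Add($"{at}: expected an object.");
                continue;
            }
            if (!TryString(item, "valuedate", out var date)
                || !DateTime.TryParseExact(date, "dd/MM/yyyy", CultureInfo.InvariantCulture,
                                           DateTimeStyles.None, out _))
                errors.Add($"{at}.valuedate: expected dd/MM/yyyy.");
            if (!TryString(item, "counterparty", out var cpty) || cpty.Length == 0)
                errors.Add($"{at}.counterparty: expected a non-empty string.");
            if (!item.TryGetProperty("trade_id", out var id)
                || id.ValueKind != JsonValueKind.Number || !id.TryGetInt64(out _))
                errors.Add($"{at}.trade_id: expected an integer.");
            if (!TryString(item, "display_ccy", out var ccy) || ccy.Length != 3)
                errors.Add($"{at}.display_ccy: expected a 3-letter ISO code.");
            if (!item.TryGetProperty("pv", out var pv) || pv.ValueKind != JsonValueKind.Number)
                errors.Add($"{at}.pv: expected a plain number.");
            if (!TryString(item, "breakclause", out var bc) || (bc != "Y" && bc != "N"))
                errors.Add($"{at}.breakclause: expected \"Y\" or \"N\".");
        }
    }
    return errors;
}
```

Returning messages instead of throwing matches the plugin's stated approach of handing structured errors back to the LLM.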
### Service: `FewShotService` → `CRC.Server/Services/FewShotService.cs`
- Loads instruction template + few-shot examples from disk at startup
- Caches a `ChatHistory` prefix (system message + alternating user/assistant example turns)
- `CloneWithEmail(string emailHtml)` → clones prefix + appends email as final user message
- `CloneWithEmailAndMessages(string emailHtml, List<NlxvaChatMessage> messages)` → for follow-ups
- Singleton lifetime
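
The prefix-and-clone idea can be sketched without any package dependency. The version below is an assumption-laden illustration: it uses plain `(Role, Content)` tuples where the real service uses a Semantic Kernel `ChatHistory`, the function names `LoadPrefix` and `CloneWithEmail` are illustrative, and the folder layout is the one shown in the Examples folder section.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Builds the cached prefix: one system message, then alternating
// user/assistant turns for each few-shot example folder.
List<(string Role, string Content)> LoadPrefix(string root)
{
    var prefix = new List<(string Role, string Content)>
    {
        ("system", File.ReadAllText(Path.Combine(root, "instruction-template.txt")))
    };

    var fewShotDir = Path.Combine(root, "few-shot");
    // Ordinal sort keeps 01, 02, 03 in a stable order across platforms.
    foreach (var dir in Directory.GetDirectories(fewShotDir)
                                 .OrderBy(d => d, StringComparer.Ordinal))
    {
        prefix.Add(("user", File.ReadAllText(Path.Combine(dir, "input.html"))));
        prefix.Add(("assistant", File.ReadAllText(Path.Combine(dir, "output.json"))));
    }
    return prefix;
}

// Copies the shared prefix so per-request messages never mutate the cached list.
List<(string Role, string Content)> CloneWithEmail(
    IReadOnlyList<(string Role, string Content)> prefix, string emailHtml)
{
    var history = new List<(string Role, string Content)>(prefix);
    history.Add(("user", emailHtml));
    return history;
}
```

Cloning matters because the service is a singleton: concurrent requests share the cached prefix, so each request needs its own copy before appending the email.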
### API Clients: `CounterpartyApiClient`, `TradeApiClient`, `CurrencyApiClient`
- Each: typed HttpClient with single async method wrapping an external API call
- Registered via `AddHttpClient<T>()` with base URL from appsettings.json
- CounterpartyApiClient.LookupAsync(name) → `GET lookup?name={name}` → `List<CandidateMatch>`
- TradeApiClient.ValidateAsync(tradeId) → `GET validate/{tradeId}` → `TradeValidationResponse`
- CurrencyApiClient.ValidateAsync(code) → `GET validate/{code}` → `CurrencyValidationResponse`
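
Because these clients call relative paths such as `lookup?name={name}`, the configured base URL needs a trailing slash. `HttpClient` combines `BaseAddress` with a relative request URI using standard URI resolution, which replaces the last path segment of a base that does not end in `/`. The short sketch below shows the difference; `Resolve` is a hypothetical helper used only for the demonstration.

```csharp
using System;

// Resolves a relative request path the same way new Uri(baseUri, relative) does.
Uri Resolve(string baseAddress, string relative) =>
    new Uri(new Uri(baseAddress), relative);

var withSlash = Resolve("http://localhost:5000/api/counterparty/", "lookup?name=Acme");
var withoutSlash = Resolve("http://localhost:5000/api/counterparty", "lookup?name=Acme");

Console.WriteLine(withSlash);    // http://localhost:5000/api/counterparty/lookup?name=Acme
Console.WriteLine(withoutSlash); // http://localhost:5000/api/lookup?name=Acme
```

User-supplied values such as the counterparty name should also be passed through `Uri.EscapeDataString` before being placed in the query string.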
### JS: `file-drop.js` → `CRC.Client/wwwroot/js/file-drop.js`
- Registers dragover/dragenter/dragleave/drop handlers on a CSS-selector target
- Reads dropped file as text via FileReader
- Calls back to .NET via `DotNetObjectReference.invokeMethodAsync`
## Contracts
### DTOs (all in CRC.Shared namespace, adapt naming to CRC conventions)
```csharp
// NlxvaChatMessage.cs
public class NlxvaChatMessage
{
public string Role { get; set; } = string.Empty; // "user" | "assistant"
public string Content { get; set; } = string.Empty;
public DateTime Timestamp { get; set; }
}
// NlxvaChatRequest.cs — POST /api/nlxva-pricer/chat
public class NlxvaChatRequest
{
public List<NlxvaChatMessage> Messages { get; set; } = new();
public string? SystemPrompt { get; set; }
public NlxvaModelSettings? Settings { get; set; }
}
// NlxvaModelSettings.cs
public class NlxvaModelSettings
{
public double? Temperature { get; set; } // 0.0-2.0
public double? TopP { get; set; } // 0.0-1.0
public int? MaxTokens { get; set; } // 1-4096
}
// NlxvaExtractionRequest.cs — POST /api/nlxva-pricer/extract
public class NlxvaExtractionRequest
{
public string EmailHtml { get; set; } = string.Empty;
public List<NlxvaChatMessage> Messages { get; set; } = new();
}
// TradeItem.cs — snake_case JSON for downstream systems
public class TradeItem
{
[JsonPropertyName("valuedate")] public string? Valuedate { get; set; }
[JsonPropertyName("counterparty")] public string? Counterparty { get; set; }
[JsonPropertyName("legal_entity")] public string? LegalEntity { get; set; }
[JsonPropertyName("trade_id")] public long TradeId { get; set; }
[JsonPropertyName("display_ccy")] public string? DisplayCcy { get; set; }
[JsonPropertyName("pv")] public double Pv { get; set; }
[JsonPropertyName("breakclause")] public string? Breakclause { get; set; }
}
// NlxvaExtractionResult.cs
public class NlxvaExtractionResult
{
[JsonPropertyName("items")]
public List<TradeItem> Items { get; set; } = new();
}
// NlxvaValidationResult.cs
public class NlxvaValidationResult
{
public bool IsValid { get; set; }
public List<string> Errors { get; set; } = new();
public List<NlxvaCandidateMatch>? Candidates { get; set; }
}
public class NlxvaCandidateMatch
{
public string Name { get; set; } = string.Empty;
public string LegalEntity { get; set; } = string.Empty;
}
```
### SSE Wire Format
```
data: {"text":"token here"}\n\n ← per token
data: [DONE]\n\n ← stream complete
data: {"error":"message"}\n\n ← on failure (followed by [DONE])
```
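
On the server side, producing these frames can be as small as the sketch below. The helper names `TextFrame`, `ErrorFrame`, and `DoneFrame` are hypothetical and not taken from the source. The point it illustrates is that serializing the payload with `System.Text.Json` escapes quotes and newlines inside a token, so each event stays on a single `data:` line and the client's line-by-line parser is not confused.

```csharp
using System.Text.Json;

// One frame per streamed token: data: {"text":"..."} followed by a blank line.
string TextFrame(string text)
{
    var json = JsonSerializer.Serialize(new { text });
    return "data: " + json + "\n\n";
}

// Error frame; per the wire format, the caller sends DoneFrame right after it.
string ErrorFrame(string message)
{
    var json = JsonSerializer.Serialize(new { error = message });
    return "data: " + json + "\n\n";
}

// Terminator frame that tells the client to stop reading.
const string DoneFrame = "data: [DONE]\n\n";
```

A controller action would write each frame to the response body and flush after every write so tokens reach the browser as they are produced.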
### Config keys (appsettings.json)
```json
{
"NlxvaPricer": {
"LlmBaseUrl": "http://localhost:8317/v1",
"LlmModel": "claude-sonnet-4-6",
"LlmApiKey": "not-needed",
"FewShotPath": "examples/extraction"
},
"ExternalApis": {
"CounterpartyBaseUrl": "http://localhost:5000/api/counterparty/",
"TradeBaseUrl": "http://localhost:5000/api/trade/",
"CurrencyBaseUrl": "http://localhost:5000/api/currency/"
}
}
```
## Critical Patterns
### 1. SSE streaming in Blazor WASM — DO NOT use `reader.EndOfStream`
**Why:** `EndOfStream` performs a synchronous peek read. Blazor WASM's async streaming pipeline
does not support synchronous reads — it will hang or throw.
**Copy this pattern:**
```csharp
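// Requires: using System.Net.Http.Json;
//           using Microsoft.AspNetCore.Components.WebAssembly.Http;
using var httpRequest = new HttpRequestMessage(HttpMethod.Post, "api/nlxva-pricer/chat")
{
    Content = JsonContent.Create(request) // request: the NlxvaChatRequest being sent
};
httpRequest.SetBrowserResponseStreamingEnabled(true);
using var response = await _httpClient.SendAsync(
    httpRequest, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();
using var stream = await response.Content.ReadAsStreamAsync();
using var reader = new StreamReader(stream);
string? line;
while ((line = await reader.ReadLineAsync()) != null) // ← NOT EndOfStream
{
    // Ordinal comparison: the SSE prefix is a protocol token, not culture-sensitive text
    if (!line.StartsWith("data: ", StringComparison.Ordinal)) continue;
    var data = line.Substring(6);
    if (data == "[DONE]") yield break;
    // parse {"text":"..."} and yield
}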
```
`SetBrowserResponseStreamingEnabled(true)` is a Blazor WASM extension that tells the browser Fetch API
to expose the response as a ReadableStream. Without it, the browser buffers the entire response.
### 2. Semantic Kernel base URL must include `/v1`
**Why:** The OpenAI SDK appends `chat/completions` directly to the base URL.
Without `/v1`, requests hit `/chat/completions` instead of `/v1/chat/completions` → 404.
```csharp
builder.Services.AddOpenAIChatCompletion(
modelId: model,
endpoint: new Uri("http://localhost:8317/v1"), // ← MUST include /v1
apiKey: "not-needed");
```
### 3. Layout height depends on AppBar height
**Why:** CRC uses a regular AppBar (64px), not Dense (48px). Magic CSS values must match.
```css
::deep .tab-container {
height: calc(100vh - 64px); /* 64px = CRC regular AppBar height */
}
```
If CRC uses `100dvh` elsewhere, prefer that over `100vh` for mobile viewport correctness.
### 4. Markdown render caching during streaming
**Why:** Without caching, every `StateHasChanged()` during streaming re-runs Markdig on ALL messages,
causing visible lag as conversation grows. Only the streaming message should re-render.
```csharp
private readonly Dictionary<NlxvaChatMessage, string> _renderedHtmlCache = new();
private string GetRenderedHtml(NlxvaChatMessage message)
{
if (_renderedHtmlCache.TryGetValue(message, out var cached))
return cached;
return Markdown.ConvertToHtml(message.Content);
}
// In finally block after streaming completes:
_renderedHtmlCache[assistantMessage] = Markdown.ConvertToHtml(assistantMessage.Content);
```
### 5. ExtractionPlugin tool results must be serialized JSON strings
**Why:** SK passes the return value as a string to the LLM. The LLM needs structured JSON
to reason about validation results, error messages, and candidate lists.
```csharp
[KernelFunction("lookup_counterparty")]
[Description("Looks up counterparty candidates by name...")]
public async Task<string> LookupCounterparty(string name)
{
var result = new NlxvaValidationResult();
// ... populate result ...
return JsonSerializer.Serialize(result); // ← return JSON string, not object
}
```
### 6. Per-request plugin import (not global)
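**Why:** `ExtractionPlugin` depends on typed HttpClients, which `AddHttpClient<T>()` registers as transient services. Resolve the plugin from the request's service provider and import it per request
instead of registering it globally on the Kernel at startup.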
```csharp
var extractionPlugin = HttpContext.RequestServices.GetRequiredService<ExtractionPlugin>();
_kernel.ImportPluginFromObject(extractionPlugin, "Extraction");
```
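### 7. C# `yield return` cannot appear inside a `try` block that has a `catch`
**Why:** Language restriction (compiler error CS1626). `yield return` is allowed in a `try`/`finally`, but not in a `try` with a `catch` clause. SSE parsing needs both: JSON parsing can throw, and each parsed delta must be yielded.
Solution: parse into local variables inside the `try`, then yield outside the `try` block.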
```csharp
string? parsedText = null;
string? parsedError = null;
try
{
using var doc = JsonDocument.Parse(data);
var root = doc.RootElement;
if (root.TryGetProperty("error", out var err))
parsedError = err.GetString();
else if (root.TryGetProperty("text", out var txt))
parsedText = txt.GetString();
}
catch (JsonException) { /* skip malformed */ }
if (parsedError != null)
throw new HttpRequestException($"API error: {parsedError}");
if (!string.IsNullOrEmpty(parsedText))
yield return parsedText;
```
## Wiring
### CRC.Server DI registration order (add to existing Program.cs / Startup.cs)
```csharp
// 1. Semantic Kernel — OpenAI-compatible connector
var llmBaseUrl = builder.Configuration["NlxvaPricer:LlmBaseUrl"] ?? "http://localhost:8317/v1";
var llmModel = builder.Configuration["NlxvaPricer:LlmModel"] ?? "claude-sonnet-4-6";
builder.Services.AddOpenAIChatCompletion(
modelId: llmModel,
endpoint: new Uri(llmBaseUrl),
apiKey: builder.Configuration["NlxvaPricer:LlmApiKey"] ?? "not-needed");
builder.Services.AddKernel();
// 2. External API typed HttpClients
// Base URLs end with "/" so relative paths (lookup?..., validate/...) resolve under them
builder.Services.AddHttpClient<CounterpartyApiClient>(c =>
c.BaseAddress = new Uri(builder.Configuration["ExternalApis:CounterpartyBaseUrl"]
?? "http://localhost:5000/api/counterparty/"));
builder.Services.AddHttpClient<TradeApiClient>(c =>
c.BaseAddress = new Uri(builder.Configuration["ExternalApis:TradeBaseUrl"]
?? "http://localhost:5000/api/trade/"));
builder.Services.AddHttpClient<CurrencyApiClient>(c =>
c.BaseAddress = new Uri(builder.Configuration["ExternalApis:CurrencyBaseUrl"]
?? "http://localhost:5000/api/currency/"));
// 3. FewShotService (singleton — loads examples once at startup)
var fewShotPath = builder.Configuration["NlxvaPricer:FewShotPath"] ?? "examples/extraction";
var fewShotAbsPath = Path.IsPathRooted(fewShotPath)
? fewShotPath
: Path.Combine(builder.Environment.ContentRootPath, fewShotPath);
builder.Services.AddSingleton(new FewShotService(fewShotAbsPath));
// 4. ExtractionPlugin (scoped: one instance per request, with its own typed HttpClients)
builder.Services.AddScoped<ExtractionPlugin>();
```
### CRC.Client DI registration (add to existing Program.cs)
```csharp
// Typed HttpClient for the NL XVA Pricer API
builder.Services.AddHttpClient<NlxvaPricerApiClient>(c =>
c.BaseAddress = new Uri(builder.Configuration["ApiBaseUrl"]
?? "https://localhost:7100/"));
// Markdown rendering (singleton — thread-safe, reusable)
builder.Services.AddSingleton<MarkdownService>();
```
### CRC.Client NavMenu (add new MudNavLink)
```razor
<MudNavLink Href="/nlxva-pricer"
Icon="@Icons.Material.Filled.SmartToy"
Match="NavLinkMatch.All">
NL XVA Pricer
</MudNavLink>
```
### CRC.Client index.html (add JS reference)
```html
<script src="js/file-drop.js"></script>
```
### CRC.Server CORS (if not already allowing the client origin)
Ensure the CORS policy allows the CRC.Client origin for the new endpoints.
### Examples folder
Copy the `examples/extraction/` folder to the CRC.Server project root:
```
examples/extraction/
├── instruction-template.txt
└── few-shot/
├── 01/
│ ├── input.html
│ └── output.json
├── 02/
│ ├── input.html
│ └── output.json
└── 03/
├── input.html
└── output.json
```
## Behavior
- **Extraction mode routing**: When an email is uploaded, `_isExtractionMode = true`. All subsequent text messages route to `/extract` (not `/chat`) until "New Chat" resets
- **Follow-up disambiguation**: The extraction endpoint receives full conversation history (email + all prior exchanges) so the agent has context for disambiguation
- **Upload message**: File upload adds a user message `[Uploaded: filename.html]` to the chat before streaming the extraction response
- **File validation**: Only `.html` files accepted (both drag-drop and file picker). Others show MudAlert warning
- **Streaming guard**: Input field, send button, upload button, and drop zone all disabled during streaming
- **Multi-turn context**: General chat sends full conversation history with every request
- **System prompt**: Only used for general chat, NOT for extraction (extraction uses fixed instruction template)
- **Model settings**: Only used for general chat, NOT for extraction
- **Settings persistence**: In-memory only (lost on page refresh) — acceptable for a debugging/iteration tool
- **DotNetObjectReference disposal**: Chat page implements IDisposable to dispose the JS interop reference
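
The routing and file-validation rules above can be captured in a few lines of page logic. The following is a minimal sketch under stated assumptions: only `_isExtractionMode` is named in the spec, so `IsAcceptedUpload`, `TryBeginExtraction`, `EndpointForNextMessage`, and `NewChat` are hypothetical names, and treating the `.html` extension case-insensitively is an assumption.

```csharp
using System;
using System.IO;

// Stands in for the page's _isExtractionMode field.
var isExtractionMode = false;

// Only .html uploads are accepted, for both drag-drop and the file picker.
bool IsAcceptedUpload(string fileName) =>
    string.Equals(Path.GetExtension(fileName), ".html", StringComparison.OrdinalIgnoreCase);

// Enters extraction mode when the upload is accepted; on false the caller
// shows the MudAlert warning and leaves the mode unchanged.
bool TryBeginExtraction(string fileName)
{
    if (!IsAcceptedUpload(fileName)) return false;
    isExtractionMode = true;
    return true;
}

// Follow-up messages go to /extract until "New Chat" resets the mode.
string EndpointForNextMessage() =>
    isExtractionMode ? "api/nlxva-pricer/extract" : "api/nlxva-pricer/chat";

void NewChat() => isExtractionMode = false;
```

Keeping this decision in one place makes it easy to unit test without rendering the component.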
## Few-Shot Instruction Template
The instruction template defines the extraction task. Content:
```
You are a trade data extraction agent. Your task is to extract structured trade data
from sales emails (typically CVA pricing requests) and return the result as JSON.
## Output Schema
Return a JSON object with an "items" array. Each item has:
- valuedate (string): dd/MM/yyyy format
- counterparty (string): full legal name from email
- trade_id (integer): Murex trade ID
- display_ccy (string): ISO currency code (£→GBP, $→USD, €→EUR)
- pv (number): plain number, no formatting
- breakclause (string): "Y" or "N" (default "N")
legal_entity is NOT included — populated later via lookup tool.
## Mapping Rules
1. FLATTEN: Each leg with unique Murex ID → separate item
2. DATE: Parse from context (e.g., "OB 27/11/2025" → "27/11/2025")
3. COUNTERPARTY: Full legal name exactly as written
4. CURRENCY: From PV column header (£→GBP, $→USD, €→EUR)
5. PV: Strip commas/symbols, plain number
6. BREAKCLAUSE: Default "N", only "Y" if explicitly mentioned
## After Extraction
Use tools: lookup_counterparty, validate_trade, validate_currency, validate_schema.
If multiple candidates, present numbered list and ask user to select.
```
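
The LLM applies the mapping rules itself, but a deterministic version of rules 4 and 5 can be handy for building expected outputs in tests or for sanity-checking a response. The sketch below is illustrative only; `ToIsoCurrency` and `ToPlainNumber` are hypothetical helpers and are not part of the source.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Rule 4: map the currency symbol in the PV column header to an ISO code.
string? ToIsoCurrency(string header)
{
    if (header.Contains('£')) return "GBP";
    if (header.Contains('$')) return "USD";
    if (header.Contains('€')) return "EUR";
    return null; // unknown symbol: leave it to the validate_currency tool
}

// Rule 5: strip commas and symbols, keeping only sign, digits, and decimal point.
double? ToPlainNumber(string raw)
{
    var cleaned = new string(raw.Where(c => char.IsDigit(c) || c == '.' || c == '-').ToArray());
    if (double.TryParse(cleaned, NumberStyles.Float, CultureInfo.InvariantCulture, out var value))
        return value;
    return null;
}
```

For instance, `ToIsoCurrency("PV (£)")` yields `GBP` and `ToPlainNumber("£1,234,567.89")` yields `1234567.89`. Negative values written in accounting style, such as `(1,000)`, are not handled by this sketch.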
---
## Compression Stats
- Source code: ~3,200 lines across 25+ files
- This spec: ~350 lines
- Compression ratio: ~9:1
- Estimated typing: ~12,000 characters (vs ~110,000 for full code)