# Feature: Natural Language XVA Pricer

## Target: CRC (Blazor WASM Hosted / ASP.NET Core / MudBlazor / .NET 8.0)

## Source: ChatAgent (cumulative export of all 12 changes)

## Includes: chat-ui, chat-streaming, semantic-kernel, multi-turn, rich-text, sidebar-nav, prompt-settings, extraction-schema, extraction-tools, few-shot-prompting, extraction-endpoint, email-upload

## Skipped: migrate-claude-md-to-openspec (documentation only), add-test-coverage (adapt to CRC test conventions separately)

---

## Integration Rule

This feature is a GUEST in CRC. Existing code, patterns, and conventions take absolute precedence.

- **DO NOT** modify existing files, components, layouts, services, routing, or DI registrations in CRC
- **DO NOT** replace existing patterns (e.g., if CRC uses a different HttpClient pattern, use theirs)
- **DO** add new files, new nav links, new routes, new DI registrations
- **DO** conform to CRC naming conventions: `E`-prefix enums, `I`-prefix interfaces, `*Dto`/`*Request`/`*Response` DTOs, PascalCase constants, `{Subject}Test` test classes
- **DO** use CRC.Shared for DTOs (not a new shared project)
- If a task conflicts with existing CRC code, **STOP and ask the user**
- If CRC already has an equivalent service (HttpClient wrapper, markdown renderer), **use the existing one**

### Adapt-to-target notes

- CRC uses `CRC.Server` (not a standalone API project): add controller and services there
- CRC uses `CRC.Client`: add pages, layout changes, client services there
- CRC uses `CRC.Shared`: add DTOs there
- CRC uses Scrutor for DI assembly scanning: register new services compatibly
- CRC uses Fluxor for client state: this feature uses local component state (no Fluxor needed), which is fine for an isolated page
- CRC uses Serilog: use `ILogger` via DI (Serilog handles the sink)
- CRC uses Azure AD auth in prod, DevAuth in dev: add `[Authorize]` if CRC controllers require it
- CRC uses `gv_web_config.csv` as primary config: put LLM config in `appsettings.json` (secondary config) where CRC already stores Serilog/DevAuth settings
- CRC AppBar is regular height (64px), not Dense (48px): adjust CSS calc accordingly

## Target Layout

```
+------------------------------------------------------------------+
| CRC AppBar (64px, blue, Elevation 1)                             |
| [=] CRC 0.0.0 APR-CRC-PROD-LDN-DEV                               |
+-------+----------------------------------------------------------+
| Drawer| MudMainContent                                           |
| Home  |                                                          |
| Pricer| (routed page content)                                    |
| Mkt   |                                                          |
| XVA   |                                                          |
| Sales |                                                          |
|>NLPric| <-- NEW: "NL XVA Pricer" nav item, route /nlxva-pricer   |
|       |                                                          |
+-------+----------------------------------------------------------+
```

- Feature name: **NL XVA Pricer** (short for Natural Language XVA Pricer)
- Route: `/nlxva-pricer`
- Navigation: new MudNavLink in the existing NavMenu component
- Icon: `Icons.Material.Filled.SmartToy`
- AppBar height: 64px (CRC uses regular, NOT Dense)
- CSS viewport calc: `calc(100vh - 64px)` (NOT 48px)

## Packages

Add to `CRC.Server`:

- `Microsoft.SemanticKernel` (latest stable, >=1.x)
- `Microsoft.SemanticKernel.Connectors.AzureOpenAI` (for Azure OpenAI connector)
- `Azure.Identity` (for `DefaultAzureCredential`; CRC may already have this)
- `Markdig` 1.1.1 (if CRC.Client doesn't already have it; check first)

No new packages for CRC.Client or CRC.Shared (MudBlazor already present).
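As a rough sketch of how the server-side packages above might be wired together, the extension method below registers a Semantic Kernel chat completion service backed by Azure OpenAI and authenticated with `DefaultAzureCredential`. The namespace, the class `NlxvaPricerServiceCollectionExtensions`, the method `AddNlxvaPricer`, and the `NlxvaPricer:Llm` configuration keys are all assumptions made for illustration; they are not taken from CRC or from the source feature, and should be adapted to CRC's Scrutor-based registration and its existing `appsettings.json` layout.

```csharp
using System;
using Azure.Identity;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;

namespace CRC.Server.Extensions; // hypothetical namespace

// Hypothetical helper: lives in a NEW file so no existing CRC registration is edited.
public static class NlxvaPricerServiceCollectionExtensions
{
    public static IServiceCollection AddNlxvaPricer(
        this IServiceCollection services, IConfiguration configuration)
    {
        // Assumed appsettings.json section; the key names are illustrative only.
        IConfigurationSection llm = configuration.GetSection("NlxvaPricer:Llm");

        string endpoint = llm["Endpoint"]
            ?? throw new InvalidOperationException("NlxvaPricer:Llm:Endpoint is not configured.");
        string deployment = llm["DeploymentName"]
            ?? throw new InvalidOperationException("NlxvaPricer:Llm:DeploymentName is not configured.");

        // Registers Kernel so it can be injected into the controller.
        services.AddKernel();

        // Token-based auth: no API key is stored in configuration.
        services.AddAzureOpenAIChatCompletion(deployment, endpoint, new DefaultAzureCredential());

        return services;
    }
}
```

Failing fast on missing configuration keeps a misconfigured environment from surfacing only on the first chat request. Where this method is called from depends on how CRC composes its services, which the spec leaves to the target project.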
## Architecture

```
CRC.Client (WASM)
    |
    | HTTP REST (SSE streaming)
    |
CRC.Server (ASP.NET Core)
├── NlxvaPricerController
│   ├── POST /api/nlxva-pricer/chat      (general chat)
│   └── POST /api/nlxva-pricer/extract   (email extraction)
│   Uses: Semantic Kernel → Azure OpenAI (via DefaultAzureCredential)
│   Uses: ExtractionPlugin (tool calling)
│   Uses: FewShotService (example loading)
├── Services/
│   ├── FewShotService (singleton, loads examples at startup)
│   ├── CounterpartyApiClient (typed HttpClient)
│   ├── TradeApiClient (typed HttpClient)
│   └── CurrencyApiClient (typed HttpClient)
├── Plugins/
│   └── ExtractionPlugin ([KernelFunction] tools)
├── CRC.Shared (DTOs)
└── CRC.Component (if reusable Blazor components needed)
```

The two endpoints share the same SSE streaming contract. General chat supports a user-supplied system prompt and model settings. Extraction uses a few-shot prefix (not the user system prompt) and extraction-specific tools.

## Components

### Page: `NlxvaPricer.razor` → `CRC.Client/Pages/NlxvaPricer.razor`

- Route: `@page "/nlxva-pricer"`
- MudTabs with 3 panels: Chat, System Prompt, Model Settings (KeepPanelsAlive=true)
- Chat panel: message list (scrollable), input area (text field + send + upload button), drag-drop zone
- Extraction mode: tracked by `_isExtractionMode` bool; routes subsequent messages to the extract endpoint
- Streaming: consumes `IAsyncEnumerable`, appends token-by-token to the assistant message
- Markdown rendering: assistant messages rendered via MarkdownService + MarkupString
- HTML render cache: `Dictionary` avoids re-running Markdig on completed messages
- JS interop: auto-scroll, drag-and-drop file handling via `file-drop.js`

### Client service: `NlxvaPricerApiClient` → `CRC.Client/Services/NlxvaPricerApiClient.cs`

- Typed HttpClient wrapper
- `SendChatStreamingAsync(NlxvaChatRequest)` → POST /api/nlxva-pricer/chat, returns `IAsyncEnumerable`
- `SendExtractionStreamingAsync(NlxvaExtractionRequest)` → POST /api/nlxva-pricer/extract, returns `IAsyncEnumerable`
- SSE parsing: read line-by-line, extract `data: {"text":"..."}` events, yield text deltas, stop at `[DONE]`

### Client service: `MarkdownService` → `CRC.Client/Services/MarkdownService.cs`

- Markdig pipeline with `UseAdvancedExtensions()`
- HTML sanitization via tag/attribute allowlist (p, h1-h6, strong, em, code, pre, ul, ol, li, a[href], table/thead/tbody/tr/th/td, br, blockquote)
- Strips `

### CRC.Server CORS (if not already allowing the client origin)

Ensure the CORS policy allows the CRC.Client origin for the new endpoints.

### Examples folder

Copy the `examples/extraction/` folder to the CRC.Server project root:

```
examples/extraction/
├── instruction-template.txt
└── few-shot/
    ├── 01/
    │   ├── input.html
    │   └── output.json
    ├── 02/
    │   ├── input.html
    │   └── output.json
    └── 03/
        ├── input.html
        └── output.json
```

## Behavior

- **Extraction mode routing**: When an email is uploaded, `_isExtractionMode = true`. All subsequent text messages route to `/extract` (not `/chat`) until "New Chat" resets the mode
- **Follow-up disambiguation**: The extraction endpoint receives the full conversation history (email + all prior exchanges) so the agent has context for disambiguation
- **Upload message**: File upload adds a user message `[Uploaded: filename.html]` to the chat before streaming the extraction response
- **File validation**: Only `.html` files are accepted (both drag-drop and file picker).
  Other file types show a MudAlert warning
- **Streaming guard**: Input field, send button, upload button, and drop zone are all disabled during streaming
- **Multi-turn context**: General chat sends the full conversation history with every request
- **System prompt**: Only used for general chat, NOT for extraction (extraction uses a fixed instruction template)
- **Model settings**: Only used for general chat, NOT for extraction
- **Settings persistence**: In-memory only (lost on page refresh), which is acceptable for a debugging/iteration tool
- **DotNetObjectReference disposal**: Chat page implements IDisposable to dispose the JS interop reference

## Few-Shot Instruction Template

The instruction template defines the extraction task. Content:

```
You are a trade data extraction agent. Your task is to extract structured trade data from sales emails (typically CVA pricing requests) and return the result as JSON.

## Output Schema
Return a JSON object with an "items" array. Each item has:
- valuedate (string): dd/MM/yyyy format
- counterparty (string): full legal name from email
- trade_id (integer): Murex trade ID
- display_ccy (string): ISO currency code (£→GBP, $→USD, €→EUR)
- pv (number): plain number, no formatting
- breakclause (string): "Y" or "N" (default "N")

legal_entity is NOT included; it is populated later via lookup tool.

## Mapping Rules
1. FLATTEN: Each leg with unique Murex ID → separate item
2. DATE: Parse from context (e.g., "OB 27/11/2025" → "27/11/2025")
3. COUNTERPARTY: Full legal name exactly as written
4. CURRENCY: From PV column header (£→GBP, $→USD, €→EUR)
5. PV: Strip commas/symbols, plain number
6. BREAKCLAUSE: Default "N", only "Y" if explicitly mentioned

## After Extraction
Use tools: lookup_counterparty, validate_trade, validate_currency, validate_schema.
If multiple candidates, present numbered list and ask user to select.
```

---

## Compression Stats

- Source code: ~3,200 lines across 25+ files
- This spec: ~350 lines
- Compression ratio: ~9:1
- Estimated typing: ~12,000 characters (vs ~110,000 for full code)
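
### Illustration: mapping rules 4 and 5

Mapping rules 4 and 5 of the Few-Shot Instruction Template (currency symbol to ISO code, PV to a plain number), together with the `dd/MM/yyyy` value date format, can be illustrated with a small C# sketch. In the feature itself the model performs this mapping. The `ExtractionRules` class and its method names are hypothetical, of the kind that might be used to spot-check extracted items in tests, and are not part of the source feature. The header text `PV (£)` used in the note after the block is likewise an assumed example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper mirroring the template's mapping rules; not part of the source feature.
public static class ExtractionRules
{
    // Symbol-to-ISO pairs exactly as listed in the instruction template.
    private static readonly Dictionary<char, string> SymbolToIso = new()
    {
        ['£'] = "GBP",
        ['$'] = "USD",
        ['€'] = "EUR",
    };

    // Rule 4: derive the currency from the PV column header.
    // Returns null when the header contains none of the known symbols.
    public static string? CurrencyFromHeader(string header)
    {
        foreach (char c in header)
        {
            if (SymbolToIso.TryGetValue(c, out string? iso))
            {
                return iso;
            }
        }
        return null;
    }

    // Rule 5: strip commas, currency symbols and whitespace, then parse a plain number.
    // Only ASCII digits, '.' and '-' are kept; accounting-style negatives such as
    // "(1,234)" are NOT handled by this sketch.
    public static decimal ParsePv(string raw)
    {
        string cleaned = new string(
            raw.Where(c => (c >= '0' && c <= '9') || c == '.' || c == '-').ToArray());

        return decimal.Parse(
            cleaned,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            CultureInfo.InvariantCulture);
    }

    // Output schema: valuedate must be in dd/MM/yyyy format.
    public static bool IsValidValueDate(string value) =>
        DateTime.TryParseExact(
            value, "dd/MM/yyyy", CultureInfo.InvariantCulture, DateTimeStyles.None, out _);
}
```

Under these assumptions, `ExtractionRules.CurrencyFromHeader("PV (£)")` returns `"GBP"`, `ExtractionRules.ParsePv("£1,234,567.89")` returns `1234567.89m`, and `ExtractionRules.IsValidValueDate("27/11/2025")` returns `true`, while an ISO-style `"2025-11-27"` is rejected.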