AgenticCode/openspec/specs/extraction-schema/spec.md
local 5b027eb0db feat: add extraction schema, sidebar nav, few-shot prompting, and prompt settings
Overhaul extraction pipeline with new TradeItem model, conversation flow,
and dedicated extraction endpoint. Add sidebar navigation with NavMenu
component and landing page. Introduce few-shot prompting service and
tests. Add prompt settings and email upload specs. Update OpenSpec
tooling with improved export-spec and extract-feature commands. Archive
completed changes and export full specs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:39:23 +01:00

Purpose

Define the TradeItem extraction schema, ExtractionResult wrapper, and mapping rules for converting sales email content into structured trade data.

Requirements

Requirement: TradeItem schema

The system SHALL define a TradeItem class with the following fields representing a single trade leg extracted from a sales email:

  • valuedate (string, dd/MM/yyyy format)
  • counterparty (string, full legal name as it appears in the email)
  • legal_entity (string, nullable — populated after counterparty disambiguation via lookup tool)
  • trade_id (long, Murex trade identifier)
  • display_ccy (string, ISO currency code e.g. "GBP", "USD")
  • pv (double, present value)
  • breakclause (string, "Y" or "N")

JSON serialization SHALL use snake_case property names via [JsonPropertyName] attributes.
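As a minimal sketch of how the schema above could look in C# with System.Text.Json: the JSON keys and field types come from the list, while the C# property names (`ValueDate`, `LegalEntity`, and so on) and the in-class default of "N" for `BreakClause` are assumptions made for illustration.

```csharp
#nullable enable
using System.Text.Json.Serialization;

// Sketch of the TradeItem schema. JSON keys and types follow the spec;
// the PascalCase property names are illustrative assumptions.
public class TradeItem
{
    [JsonPropertyName("valuedate")]
    public string ValueDate { get; set; } = "";      // dd/MM/yyyy

    [JsonPropertyName("counterparty")]
    public string Counterparty { get; set; } = "";   // full legal name from the email

    [JsonPropertyName("legal_entity")]
    public string? LegalEntity { get; set; }         // null until the lookup tool resolves it

    [JsonPropertyName("trade_id")]
    public long TradeId { get; set; }                // Murex trade identifier

    [JsonPropertyName("display_ccy")]
    public string DisplayCcy { get; set; } = "";     // ISO code, e.g. "GBP"

    [JsonPropertyName("pv")]
    public double Pv { get; set; }                   // present value

    // Assumption: the class default mirrors the "default to N" mapping rule.
    [JsonPropertyName("breakclause")]
    public string BreakClause { get; set; } = "N";   // "Y" or "N"
}
```

With the default `JsonSerializer` options, a null `LegalEntity` is written as `"legal_entity":null` rather than omitted, which matches the pre-lookup scenario.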

Scenario: All fields populated

  • WHEN the extraction agent produces a TradeItem with all fields
  • THEN the JSON output contains all seven fields with snake_case keys and correct types

Scenario: legal_entity not yet resolved

  • WHEN the extraction agent produces a TradeItem before counterparty lookup
  • THEN the legal_entity field is null and all other fields are populated

Requirement: ExtractionResult wrapper

The system SHALL define an ExtractionResult class containing a List<TradeItem> Items property. All extraction output from a single email SHALL be wrapped in this object.

Scenario: Single email with multiple trade legs

  • WHEN an email contains two swaps with two legs each (4 trades total)
  • THEN the ExtractionResult contains an items array with 4 TradeItem objects

Scenario: JSON output structure

  • WHEN the ExtractionResult is serialized to JSON
  • THEN the output has the shape {"items": [{"valuedate": "...", ...}, ...]}
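The wrapper and its JSON shape might be sketched as follows. `TradeItem` is abbreviated to two of its seven fields to keep the example short, and the `ExtractionResultJson` helper is a hypothetical name, not something the spec defines.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Abbreviated TradeItem (two of the seven fields) so the sketch stays short.
public class TradeItem
{
    [JsonPropertyName("valuedate")]
    public string ValueDate { get; set; } = "";

    [JsonPropertyName("trade_id")]
    public long TradeId { get; set; }
}

// Wraps every TradeItem extracted from a single email.
public class ExtractionResult
{
    [JsonPropertyName("items")]
    public List<TradeItem> Items { get; set; } = new List<TradeItem>();
}

// Hypothetical helper: serializes the wrapper with default options.
public static class ExtractionResultJson
{
    public static string ToJson(ExtractionResult result) =>
        JsonSerializer.Serialize(result);
}
```

Serializing a result with two items produces an object whose only top-level key is `items`, holding one JSON object per trade leg.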

Requirement: Extraction mapping rules

The extraction agent SHALL follow these mapping rules when converting email content to TradeItems:

  • Each swap leg (identified by a unique Murex trade ID) becomes a separate TradeItem
  • The valuedate SHALL be parsed from date references in the email (e.g., "OB 27/11/2025") and formatted as dd/MM/yyyy
  • The counterparty SHALL be the full legal name exactly as stated in the email prose
  • The display_ccy SHALL be derived from the currency symbol or code in the email (e.g., "£" or "PV (£)" → "GBP")
  • The breakclause SHALL default to "N" if not explicitly mentioned in the email
  • The pv SHALL be the numeric present value without formatting (no commas, no currency symbols)
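The field-level rules above could be expressed as small pure helpers. This is only a sketch: the `MappingRules` class and its method names are hypothetical, only the "£" to "GBP" mapping comes from the spec (the "$" and "€" entries are illustrative assumptions), and dates are assumed to be written day-first.

```csharp
#nullable enable
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class MappingRules
{
    // "£" or "PV (£)" -> "GBP". The USD and EUR branches are assumptions;
    // "$" in particular is ambiguous across currencies.
    public static string? ToIsoCurrency(string text)
    {
        if (text.Contains("£") || text.Contains("GBP")) return "GBP";
        if (text.Contains("$") || text.Contains("USD")) return "USD";
        if (text.Contains("€") || text.Contains("EUR")) return "EUR";
        return null; // unknown: leave for the caller to handle
    }

    // Finds the first d/M/yyyy date in the text (e.g. "OB 27/11/2025")
    // and re-emits it as dd/MM/yyyy. Returns null when no date is present;
    // throws FormatException for an impossible date such as 31/02/2025.
    public static string? ExtractValueDate(string text)
    {
        Match m = Regex.Match(text, @"\b\d{1,2}/\d{1,2}/\d{4}\b");
        if (!m.Success) return null;
        DateTime d = DateTime.ParseExact(m.Value, "d/M/yyyy", CultureInfo.InvariantCulture);
        return d.ToString("dd/MM/yyyy", CultureInfo.InvariantCulture);
    }

    // Strips thousands separators and currency symbols, then parses
    // culture-invariantly. Parenthesised negatives are not handled here.
    public static double ParsePv(string raw)
    {
        string cleaned = raw.Replace(",", "").Replace("£", "")
                            .Replace("$", "").Replace("€", "").Trim();
        return double.Parse(cleaned, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // "Y" only when explicitly flagged; anything else, including null, is "N".
    public static string NormalizeBreakClause(string? raw) =>
        string.Equals(raw?.Trim(), "Y", StringComparison.OrdinalIgnoreCase) ? "Y" : "N";
}
```

Keeping each rule in its own function makes it straightforward to test the scenarios below one rule at a time.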

Scenario: Flatten multi-leg swap into individual items

  • WHEN the email contains a swap with Coupon Leg (Murex 79353083) and APD leg (Murex 79353084)
  • THEN the output contains two separate TradeItems, one per Murex ID
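A sketch of the flattening step, assuming the email has already been parsed into some intermediate shape. `ParsedSwap`, `ParsedLeg`, and `SwapFlattener` are hypothetical names, since the spec does not say how parsed email content is represented, and `TradeItem` is abbreviated (no JSON attributes, no `legal_entity`).

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical intermediate shapes for parsed email content.
public record ParsedLeg(string Name, long MurexId, double Pv);
public record ParsedSwap(string Counterparty, string ValueDate,
                         string DisplayCcy, List<ParsedLeg> Legs);

// Abbreviated TradeItem for this sketch only.
public class TradeItem
{
    public string ValueDate { get; set; } = "";
    public string Counterparty { get; set; } = "";
    public long TradeId { get; set; }
    public string DisplayCcy { get; set; } = "";
    public double Pv { get; set; }
    public string BreakClause { get; set; } = "N";
}

public static class SwapFlattener
{
    // Emits one TradeItem per leg; swap-level fields are copied onto each leg.
    public static List<TradeItem> Flatten(IEnumerable<ParsedSwap> swaps) =>
        swaps.SelectMany(swap => swap.Legs.Select(leg => new TradeItem
        {
            ValueDate = swap.ValueDate,
            Counterparty = swap.Counterparty,
            TradeId = leg.MurexId,
            DisplayCcy = swap.DisplayCcy,
            Pv = leg.Pv,
        })).ToList();
}
```

For a swap with a Coupon Leg (Murex 79353083) and an APD leg (Murex 79353084), `Flatten` returns two items in leg order, each carrying its own Murex ID.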

Scenario: Currency symbol to ISO code mapping

  • WHEN the email shows PV values in "PV (£)" column
  • THEN the display_ccy field is set to "GBP"

Scenario: Default breakclause

  • WHEN the email does not mention break clauses
  • THEN all TradeItems have breakclause set to "N"