AgenticCode/openspec/changes/archive/2026-04-06-update-extraction-schema/proposal.md
local 5b027eb0db feat: add extraction schema, sidebar nav, few-shot prompting, and prompt settings
Overhaul extraction pipeline with new TradeItem model, conversation flow,
and dedicated extraction endpoint. Add sidebar navigation with NavMenu
component and landing page. Introduce few-shot prompting service and
tests. Add prompt settings and email upload specs. Update OpenSpec
tooling with improved export-spec and extract-feature commands. Archive
completed changes and export full specs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:39:23 +01:00

Why

The current extraction schema (ExtractedFields) uses placeholder fields (Client, Project, Hours, Rate, Currency, Date) that don't match the real domain. The actual use case is CVA (Credit Valuation Adjustment) trade extraction from sales emails — parsing HTML emails into structured trade items with fields like counterparty, trade_id, pv, and legal_entity. The single ExtractionPlugin.ValidateExtractedFields() method also needs to be replaced with multiple tools that wrap existing external APIs for counterparty lookup, trade validation, and other checks.

What Changes

  • Replace ExtractedFields.cs with real domain models: ExtractionResult (wrapper) and TradeItem (per-trade fields: valuedate, counterparty, legal_entity, trade_id, display_ccy, pv, breakclause)
  • Replace the single validation method in ExtractionPlugin.cs with 3-5 Semantic Kernel (SK) plugin methods, each wrapping an existing external API (counterparty lookup, trade validation, currency validation, schema validation)
  • Update ValidationResult.cs to support richer results — candidate lists for disambiguation, not just pass/fail
  • Add typed HttpClients for the external validation/lookup APIs, configured via appsettings.json
  • Update existing tests that reference the old ExtractedFields and ExtractionPlugin
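
The replacement models listed above might take roughly the following shape. This is a language-neutral sketch in Python rather than the project's C#. The TradeItem field names come from the list above; every type, the optional breakclause, the `trades` attribute name, and the `to_json` helper are assumptions made for illustration, and the sample values are placeholders.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TradeItem:
    """One extracted trade. Field names follow the proposal; types are assumed."""
    valuedate: str                     # assumed date string; format is set by the mapping rules
    counterparty: str
    legal_entity: str
    trade_id: str
    display_ccy: str                   # assumed to be a currency code
    pv: float                          # assumed numeric present value
    breakclause: Optional[str] = None  # assumed optional; the real default lives in the spec


@dataclass
class ExtractionResult:
    """Wrapper holding every trade extracted from one email (attribute name assumed)."""
    trades: list[TradeItem] = field(default_factory=list)

    def to_json(self) -> str:
        # Hypothetical helper: serialize the wrapper together with its trades.
        return json.dumps(asdict(self))


# Placeholder values only; none of them come from the proposal.
result = ExtractionResult(
    trades=[TradeItem("2026-04-06", "ACME", "ACME Ltd", "T-1", "USD", 1250.0)]
)
print(result.to_json())
```

Keeping the wrapper separate from the per-trade record lets one email yield any number of trades without changing the item schema.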

Capabilities

New Capabilities

  • extraction-schema: Defines the real TradeItem schema, ExtractionResult wrapper, and the mapping rules from email content to structured output (date format, flattening swap legs, breakclause defaults)
  • extraction-tools: Defines the external API tool plugins — counterparty lookup (with disambiguation), trade validation, currency validation, and final schema validation
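
A counterparty lookup tool with disambiguation could be sketched as follows, again in Python rather than the project's C#. The external API is represented by an injected `search` callable. `Candidate`, the `ValidationResult` fields, `lookup_counterparty`, and `fake_search` with its directory data are all hypothetical names and values, not the real plugin or API contract.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Candidate:
    counterparty: str
    legal_entity: str


@dataclass
class ValidationResult:
    """Richer than pass/fail: carries the candidates needed for disambiguation."""
    is_valid: bool
    message: str = ""
    candidates: list[Candidate] = field(default_factory=list)


def lookup_counterparty(name: str, search: Callable[[str], list[dict]]) -> ValidationResult:
    """Wrap an external lookup (passed in as `search`) and classify its matches."""
    matches = [Candidate(m["counterparty"], m["legal_entity"]) for m in search(name)]
    if not matches:
        return ValidationResult(False, f"no counterparty matches {name!r}")
    if len(matches) == 1:
        return ValidationResult(True, candidates=matches)
    # More than one match: not valid yet, the caller must ask the user to choose.
    return ValidationResult(False, f"{len(matches)} candidates; user selection required", matches)


def fake_search(name: str) -> list[dict]:
    # Stand-in for the real HTTP call; the directory is invented for the example.
    directory = [
        {"counterparty": "ACME", "legal_entity": "ACME Ltd"},
        {"counterparty": "ACME", "legal_entity": "ACME GmbH"},
        {"counterparty": "Globex", "legal_entity": "Globex Inc"},
    ]
    return [row for row in directory if row["counterparty"].lower() == name.lower()]
```

Passing the lookup in as a callable keeps the tool testable without network access, which is the same role a typed HTTP client would play in the real service.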

Modified Capabilities

  • agent-extraction: Update requirements to reference the real schema (TradeItem) instead of generic "predefined fields", and add disambiguation workflow where tool results require user selection (e.g., counterparty/legal_entity tuples)
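
The disambiguation step described above can be illustrated with a small resolver. This is a sketch under assumptions, written in Python rather than the project's C#: candidates are modeled as (counterparty, legal_entity) tuples, the trade as a plain dict, and `resolve_counterparty` with its `choice` parameter is a hypothetical name, not part of the existing agent.

```python
from __future__ import annotations

from typing import List, Optional, Tuple

Candidate = Tuple[str, str]  # (counterparty, legal_entity)


def resolve_counterparty(
    trade: dict,
    candidates: List[Candidate],
    choice: Optional[int] = None,
) -> dict:
    """Return a copy of `trade` with counterparty and legal_entity filled in.

    Raises LookupError when the tool returned nothing, and ValueError when
    several candidates exist but the user has not picked one yet.
    """
    if not candidates:
        raise LookupError("no candidates returned by the lookup tool")
    if len(candidates) == 1:
        index = 0  # unambiguous: no user input needed
    elif choice is None:
        raise ValueError("ambiguous counterparty: ask the user to pick a candidate")
    else:
        index = choice
    if not 0 <= index < len(candidates):
        raise IndexError(f"choice {index} is outside the candidate list")
    counterparty, legal_entity = candidates[index]
    # Copy rather than mutate, so the unresolved trade stays available for retries.
    return {**trade, "counterparty": counterparty, "legal_entity": legal_entity}
```

In an agent loop, the ValueError branch is the point at which the conversation would pause and present the candidate list to the user before calling the resolver again with their selection.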

Impact

  • Shared models: ExtractedFields.cs replaced — BREAKING for any code referencing old fields
  • API plugins: ExtractionPlugin.cs rewritten with new method signatures — BREAKING for existing tool calling behavior
  • External dependencies: New HTTP calls to existing external APIs (counterparty, trade, currency)
  • Configuration: New appsettings.json entries for external API base URLs
  • Tests: Existing extraction-related tests need rewriting against new schema and tools