AgenticCode/openspec/changes/archive/2026-04-06-update-extraction-schema/proposal.md
local 5b027eb0db feat: add extraction schema, sidebar nav, few-shot prompting, and prompt settings
Overhaul extraction pipeline with new TradeItem model, conversation flow,
and dedicated extraction endpoint. Add sidebar navigation with NavMenu
component and landing page. Introduce few-shot prompting service and
tests. Add prompt settings and email upload specs. Update OpenSpec
tooling with improved export-spec and extract-feature commands. Archive
completed changes and export full specs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:39:23 +01:00

## Why
The current extraction schema (`ExtractedFields`) uses placeholder fields (Client, Project, Hours, Rate, Currency, Date) that don't match the real domain. The actual use case is CVA (Credit Valuation Adjustment) trade extraction from sales emails — parsing HTML emails into structured trade items with fields like counterparty, trade_id, pv, and legal_entity. The single `ExtractionPlugin.ValidateExtractedFields()` method also needs to be replaced with multiple tools that wrap existing external APIs for counterparty lookup, trade validation, and other checks.
## What Changes
- **Replace `ExtractedFields.cs`** with real domain models: `ExtractionResult` (wrapper) and `TradeItem` (per-trade fields: valuedate, counterparty, legal_entity, trade_id, display_ccy, pv, breakclause)
- **Replace the single validation method in `ExtractionPlugin.cs`** with 3-5 Semantic Kernel (SK) plugin methods, each wrapping an existing external API (counterparty lookup, trade validation, currency validation, schema validation)
- **Update `ValidationResult.cs`** to support richer results — candidate lists for disambiguation, not just pass/fail
- **Add typed HttpClients** for the external validation/lookup APIs, configured via `appsettings.json`
- **Update existing tests** that reference the old `ExtractedFields` and `ExtractionPlugin`
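
As a rough illustration of the model change above, the new types might look like the sketch below. The field names come from this proposal; the C# property types, the nullability, and the `trades` wrapper property are assumptions, and the date format and breakclause default are left to the `extraction-schema` spec.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json.Serialization;

// Hypothetical shape: field names are from the proposal, types are assumptions.
public sealed class TradeItem
{
    [JsonPropertyName("valuedate")]
    public string? ValueDate { get; set; }      // kept as text; format is set by the spec

    [JsonPropertyName("counterparty")]
    public string? Counterparty { get; set; }

    [JsonPropertyName("legal_entity")]
    public string? LegalEntity { get; set; }

    [JsonPropertyName("trade_id")]
    public string? TradeId { get; set; }

    [JsonPropertyName("display_ccy")]
    public string? DisplayCcy { get; set; }

    [JsonPropertyName("pv")]
    public decimal? Pv { get; set; }            // decimal is an assumption

    [JsonPropertyName("breakclause")]
    public string? BreakClause { get; set; }    // default value is defined by the spec
}

// Wrapper holding every trade extracted from one email.
public sealed class ExtractionResult
{
    [JsonPropertyName("trades")]                // property name is an assumption
    public List<TradeItem> Trades { get; set; } = new();
}
```

Keeping the JSON names in attributes lets the C# properties follow normal naming conventions while the serialized output keeps the snake_case field names listed above.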
## Capabilities
### New Capabilities
- `extraction-schema`: Defines the real TradeItem schema, ExtractionResult wrapper, and the mapping rules from email content to structured output (date format, flattening swap legs, breakclause defaults)
- `extraction-tools`: Defines the external API tool plugins — counterparty lookup (with disambiguation), trade validation, currency validation, and final schema validation
### Modified Capabilities
- `agent-extraction`: Update requirements to reference the real schema (TradeItem) instead of generic "predefined fields", and add disambiguation workflow where tool results require user selection (e.g., counterparty/legal_entity tuples)
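
The disambiguation workflow could be modelled with a result type that carries candidates instead of a bare pass/fail flag. The following is a minimal sketch under assumptions: `ValidationStatus`, `CounterpartyCandidate`, and `FromMatches` are hypothetical names, not part of the existing `ValidationResult.cs`.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical status values; the real ValidationResult may differ.
public enum ValidationStatus { Valid, Invalid, NeedsDisambiguation }

// One counterparty/legal_entity tuple returned by a lookup tool.
public sealed record CounterpartyCandidate(string Counterparty, string LegalEntity);

public sealed record ValidationResult(
    ValidationStatus Status,
    IReadOnlyList<CounterpartyCandidate> Candidates,
    string? Message = null)
{
    // Maps the number of lookup matches to a status:
    // none -> Invalid, exactly one -> Valid, several -> user must choose.
    public static ValidationResult FromMatches(IReadOnlyList<CounterpartyCandidate> matches) =>
        matches.Count switch
        {
            0 => new ValidationResult(ValidationStatus.Invalid, matches,
                     "No matching counterparty found."),
            1 => new ValidationResult(ValidationStatus.Valid, matches),
            _ => new ValidationResult(ValidationStatus.NeedsDisambiguation, matches,
                     "Multiple matches; user selection required."),
        };
}
```

With a shape like this, the agent can surface `Candidates` to the user whenever `Status` is `NeedsDisambiguation` and only write the chosen tuple into the `TradeItem` afterwards.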
## Impact
- **Shared models**: `ExtractedFields.cs` replaced — **BREAKING** for any code referencing old fields
- **API plugins**: `ExtractionPlugin.cs` rewritten with new method signatures — **BREAKING** for existing tool calling behavior
- **External dependencies**: New HTTP calls to existing external APIs (counterparty, trade, currency)
- **Configuration**: New `appsettings.json` entries for external API base URLs
- **Tests**: Existing extraction-related tests need rewriting against new schema and tools
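
For the configuration and external-dependency items above, one possible wiring of a typed `HttpClient` is sketched below. The client class name, the `ExternalApis:Counterparty:BaseUrl` configuration key, and the `counterparties?name=` route are all assumptions made for illustration; the real external APIs and key names are not described in this proposal.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical typed client for the counterparty lookup API.
public sealed class CounterpartyApiClient
{
    private readonly HttpClient _http;

    public CounterpartyApiClient(HttpClient http) => _http = http;

    // The route is an assumption; BaseAddress comes from configuration.
    public Task<HttpResponseMessage> LookupAsync(string name, CancellationToken ct = default) =>
        _http.GetAsync($"counterparties?name={Uri.EscapeDataString(name)}", ct);
}

public static class ExternalApiRegistration
{
    // Registers the typed client using a base URL read from appsettings.json.
    public static IServiceCollection AddExternalApiClients(
        this IServiceCollection services, IConfiguration config)
    {
        // Hypothetical key; fail fast at startup if it is missing.
        var baseUrl = config["ExternalApis:Counterparty:BaseUrl"]
            ?? throw new InvalidOperationException(
                "Missing configuration value ExternalApis:Counterparty:BaseUrl");

        services.AddHttpClient<CounterpartyApiClient>(
            client => client.BaseAddress = new Uri(baseUrl));
        return services;
    }
}
```

Failing fast on a missing base URL surfaces configuration mistakes at startup instead of on the first tool call. The trade and currency clients would presumably follow the same pattern with their own keys.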