fix: update export bundle for Azure OpenAI and add streaming diagnostics

Replace CLIProxyAPI/local proxy references with Azure OpenAI using
DefaultAzureCredential and tenant ID auth. Add Critical Pattern #8
for SSE buffering diagnostics with timestamped curl test. Add
streaming verification tasks (T6b, T15) and troubleshooting entries
for Azure AD auth, RBAC, response compression, and proxy buffering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
local
2026-04-07 01:42:38 +01:00
parent d46b179221
commit 956ec243c5
3 changed files with 256 additions and 59 deletions

@@ -58,6 +58,8 @@ This feature is a GUEST in CRC. Existing code, patterns, and conventions take ab
Add to `CRC.Server`:
- `Microsoft.SemanticKernel` (latest stable, >=1.x)
+- `Microsoft.SemanticKernel.Connectors.AzureOpenAI` (for Azure OpenAI connector)
+- `Azure.Identity` (for `DefaultAzureCredential` — CRC may already have this)
- `Markdig` 1.1.1 (if CRC.Client doesn't already have it — check first)
No new packages for CRC.Client or CRC.Shared (MudBlazor already present).
@@ -73,7 +75,7 @@ CRC.Server (ASP.NET Core)
├── NlxvaPricerController
│ ├── POST /api/nlxva-pricer/chat (general chat)
│ └── POST /api/nlxva-pricer/extract (email extraction)
-│ Uses: Semantic Kernel → CLIProxyAPI (OpenAI-compatible proxy)
+│ Uses: Semantic Kernel → Azure OpenAI (via DefaultAzureCredential)
│ Uses: ExtractionPlugin (tool calling)
│ Uses: FewShotService (example loading)
├── Services/
@@ -232,9 +234,9 @@ data: {"error":"message"}\n\n ← on failure (followed by [DONE])
```json
{
"NlxvaPricer": {
-"LlmBaseUrl": "http://localhost:8317/v1",
-"LlmModel": "claude-sonnet-4-6",
-"LlmApiKey": "not-needed",
+"AzureOpenAIEndpoint": "https://your-resource.openai.azure.com/",
+"DeploymentName": "gpt4o-prod",
+"TenantId": "<your-azure-ad-tenant-id>",
"FewShotPath": "examples/extraction"
},
"ExternalApis": {
@@ -245,6 +247,11 @@ data: {"error":"message"}\n\n ← on failure (followed by [DONE])
}
```
+If using API key auth instead of Azure AD, replace `TenantId` with:
+```json
+"ApiKey": "<your-azure-openai-api-key>"
+```
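A minimal sketch of how the two configuration variants above could be told apart at startup. `NlxvaPricerAuth`, `AuthMode`, and `Mode()` are hypothetical names, not part of CRC. The rule that setting both values is an error is an assumption drawn from the word "replace" above, not a documented requirement.

```csharp
using System;

// Usage: the two variants shown in the configuration above.
var aad = new NlxvaPricerAuth(TenantId: "<your-azure-ad-tenant-id>", ApiKey: null);
Console.WriteLine(aad.Mode());   // AzureAd
var key = new NlxvaPricerAuth(TenantId: null, ApiKey: "<your-azure-openai-api-key>");
Console.WriteLine(key.Mode());   // ApiKey

enum AuthMode { AzureAd, ApiKey }

// Hypothetical holder for the two auth-related settings.
record NlxvaPricerAuth(string? TenantId, string? ApiKey)
{
    public AuthMode Mode()
    {
        var hasTenant = !string.IsNullOrWhiteSpace(TenantId);
        var hasKey = !string.IsNullOrWhiteSpace(ApiKey);
        // Assumption: both values present means the config is ambiguous.
        if (hasTenant && hasKey)
            throw new InvalidOperationException(
                "Set NlxvaPricer:TenantId or NlxvaPricer:ApiKey, not both.");
        return hasKey ? AuthMode.ApiKey : AuthMode.AzureAd;
    }
}
```

Failing at startup on an ambiguous config is one possible design choice; silently preferring one value would also work but hides the mistake.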
## Critical Patterns
### 1. SSE streaming in Blazor WASM — DO NOT use `reader.EndOfStream`
@@ -276,16 +283,33 @@ while ((line = await reader.ReadLineAsync()) != null) // ← NOT EndOfStream
`SetBrowserResponseStreamingEnabled(true)` is a Blazor WASM extension that tells the browser Fetch API
to expose the response as a ReadableStream. Without it, the browser buffers the entire response.
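As a minimal sketch of the read loop from Critical Pattern #1, the following parser walks an SSE body line by line and stops at the `[DONE]` sentinel. `SseLines` and `ReadDataAsync` are hypothetical names, and a `StringReader` stands in for the `StreamReader` over the real response stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Usage: feed a canned SSE body through the parser.
var body = "data: Hello\n\ndata: world\n\ndata: [DONE]\n\n";
await foreach (var payload in SseLines.ReadDataAsync(new StringReader(body)))
{
    Console.WriteLine(payload); // prints "Hello", then "world"
}

// Hypothetical helper, not part of CRC.
static class SseLines
{
    public static async IAsyncEnumerable<string> ReadDataAsync(TextReader reader)
    {
        string? line;
        // Loop until ReadLineAsync returns null; do not test EndOfStream.
        while ((line = await reader.ReadLineAsync()) != null)
        {
            if (!line.StartsWith("data: ", StringComparison.Ordinal))
                continue; // skip blank separators and other SSE fields

            var data = line.Substring("data: ".Length);
            if (data == "[DONE]")
                yield break; // end-of-stream sentinel
            yield return data;
        }
    }
}
```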
-### 2. Semantic Kernel base URL must include `/v1`
+### 2. Azure OpenAI: use deployment name, NOT model name; NO `/v1` suffix
-**Why:** The OpenAI SDK appends `chat/completions` directly to the base URL.
-Without `/v1`, requests hit `/chat/completions` instead of `/v1/chat/completions` → 404.
+**Why:** Azure OpenAI uses `AddAzureOpenAIChatCompletion()`, not `AddOpenAIChatCompletion()`.
+The endpoint is your Azure resource URL (no `/v1` — the Azure SDK constructs the path internally).
+The `deploymentName` is the name you gave the deployment in Azure portal, not the model name.
+Auth uses `DefaultAzureCredential` with the tenant ID, not an API key.
```csharp
-builder.Services.AddOpenAIChatCompletion(
-    modelId: model,
-    endpoint: new Uri("http://localhost:8317/v1"), // ← MUST include /v1
-    apiKey: "not-needed");
+using Azure.Identity;
+builder.Services.AddAzureOpenAIChatCompletion(
+    deploymentName: builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod",
+    endpoint: builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
+        ?? "https://your-resource.openai.azure.com/",
+    credentials: new DefaultAzureCredential(
+        new DefaultAzureCredentialOptions
+        {
+            TenantId = builder.Configuration["NlxvaPricer:TenantId"]
+        }));
```
+If using API key instead of Azure AD:
+```csharp
+builder.Services.AddAzureOpenAIChatCompletion(
+    deploymentName: "gpt4o-prod",
+    endpoint: "https://your-resource.openai.azure.com/",
+    apiKey: builder.Configuration["NlxvaPricer:ApiKey"]);
+```
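Because the previous proxy configuration ended in `/v1` and the Azure endpoint must not, a leftover suffix is an easy migration mistake. A startup check along these lines could catch it. This is a sketch only: `AzureEndpointCheck` and `IsResourceRoot` are hypothetical names, and treating "https with no path" as the valid shape is an assumption based on the example endpoint above.

```csharp
using System;

// Usage: the Azure-style endpoint passes; the old proxy-style URLs do not.
Console.WriteLine(AzureEndpointCheck.IsResourceRoot("https://your-resource.openai.azure.com/"));   // True
Console.WriteLine(AzureEndpointCheck.IsResourceRoot("https://your-resource.openai.azure.com/v1")); // False
Console.WriteLine(AzureEndpointCheck.IsResourceRoot("http://localhost:8317/v1"));                  // False

// Hypothetical helper, not part of CRC or Semantic Kernel.
static class AzureEndpointCheck
{
    // True when the value is an absolute https URL with no path beyond "/".
    public static bool IsResourceRoot(string endpoint)
    {
        if (!Uri.TryCreate(endpoint, UriKind.Absolute, out var uri))
            return false;
        return uri.Scheme == Uri.UriSchemeHttps && uri.AbsolutePath == "/";
    }
}
```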
### 3. Layout height depends on AppBar height
@@ -371,18 +395,72 @@ if (!string.IsNullOrEmpty(parsedText))
yield return parsedText;
```
+### 8. SSE response buffering — verify both streaming hops
+**Why:** The architecture has two streaming hops: Azure OpenAI → CRC.Server → Browser.
+If anything buffers in either hop, the user sees no tokens until the full response completes.
+Common buffers: response compression middleware, reverse proxies (NGINX/IIS), Azure API Management.
+**Diagnostic endpoint (add temporarily, remove after verifying):**
+```csharp
+[HttpGet("stream-test")]
+public async Task StreamTest()
+{
+    Response.ContentType = "text/event-stream";
+    Response.Headers["Cache-Control"] = "no-cache";
+    Response.Headers["X-Accel-Buffering"] = "no"; // NGINX hint
+    var chatService = _kernel.GetRequiredService<IChatCompletionService>();
+    var history = new ChatHistory();
+    history.AddUserMessage("Count from 1 to 10, one number per line.");
+    var sw = System.Diagnostics.Stopwatch.StartNew();
+    await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(history))
+    {
+        if (!string.IsNullOrEmpty(chunk.Content))
+        {
+            await Response.WriteAsync($"data: [{sw.ElapsedMilliseconds}ms] {chunk.Content}\n\n");
+            await Response.Body.FlushAsync();
+        }
+    }
+    await Response.WriteAsync("data: [DONE]\n\n");
+}
+```
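+**Test with:** `curl -N https://localhost:7100/api/nlxva-pricer/stream-test`
+- Lines arrive one by one, with timestamps spread over seconds = both hops stream
+- Timestamps clustered together = the Azure OpenAI → CRC.Server hop is buffering (the stopwatch runs on the server)
+- Timestamps spread out, but all lines arrive at once = something between CRC.Server and the client is buffering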
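To read the curl output without eyeballing it, the embedded stamps can be extracted and their spread computed. The sketch below assumes the `data: [<n>ms] ...` line format written by the diagnostic endpoint above. `StreamStamps` is a hypothetical helper and the sample values are made up. Because the stamps are taken on the server, they describe only the Azure OpenAI → CRC.Server hop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Sample curl output as it might look when streaming works (made-up values).
var lines = new[]
{
    "data: [412ms] 1",
    "data: [655ms] 2",
    "data: [903ms] 3",
    "data: [DONE]",
};
var stamps = StreamStamps.Parse(lines);
Console.WriteLine($"chunks={stamps.Count}, spread={StreamStamps.SpreadMs(stamps)}ms");
// prints: chunks=3, spread=491ms

// Hypothetical helper, not part of CRC.
static class StreamStamps
{
    static readonly Regex Stamp = new Regex(@"^data: \[(\d+)ms\]");

    // Extracts the millisecond stamps; lines without one (such as [DONE]) are skipped.
    public static List<long> Parse(IEnumerable<string> lines) =>
        lines.Select(l => Stamp.Match(l))
             .Where(m => m.Success)
             .Select(m => long.Parse(m.Groups[1].Value))
             .ToList();

    // Time between the first and last stamped chunk; 0 when there are none.
    public static long SpreadMs(IReadOnlyList<long> stamps) =>
        stamps.Count == 0 ? 0 : stamps[stamps.Count - 1] - stamps[0];
}
```

A spread near zero across many chunks suggests the first hop delivered everything at once. What counts as "near zero" depends on response length, so no threshold is hard-coded here.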
+**If CRC.Server uses `UseResponseCompression()`, exclude SSE:**
+```csharp
+Response.Headers["Content-Encoding"] = "identity"; // opt out per-response
+```
+**Response headers to always set on SSE endpoints:**
+```csharp
+Response.ContentType = "text/event-stream";
+Response.Headers["Cache-Control"] = "no-cache";
+Response.Headers["X-Accel-Buffering"] = "no"; // prevents NGINX buffering
+```
## Wiring
### CRC.Server DI registration order (add to existing Program.cs / Startup.cs)
```csharp
-// 1. Semantic Kernel — OpenAI-compatible connector
-var llmBaseUrl = builder.Configuration["NlxvaPricer:LlmBaseUrl"] ?? "http://localhost:8317/v1";
-var llmModel = builder.Configuration["NlxvaPricer:LlmModel"] ?? "claude-sonnet-4-6";
-builder.Services.AddOpenAIChatCompletion(
-    modelId: llmModel,
-    endpoint: new Uri(llmBaseUrl),
-    apiKey: builder.Configuration["NlxvaPricer:LlmApiKey"] ?? "not-needed");
+// 1. Semantic Kernel — Azure OpenAI connector with Azure AD auth
+using Azure.Identity; // place with the other using directives at the top of the file
+var azureEndpoint = builder.Configuration["NlxvaPricer:AzureOpenAIEndpoint"]
+    ?? "https://your-resource.openai.azure.com/";
+var deploymentName = builder.Configuration["NlxvaPricer:DeploymentName"] ?? "gpt4o-prod";
+var tenantId = builder.Configuration["NlxvaPricer:TenantId"];
+builder.Services.AddAzureOpenAIChatCompletion(
+    deploymentName: deploymentName,
+    endpoint: azureEndpoint,
+    credentials: new DefaultAzureCredential(
+        new DefaultAzureCredentialOptions { TenantId = tenantId }));
builder.Services.AddKernel();
// 2. External API typed HttpClients