Synthetic demo · fully reproducible

Copilot Studio Feedback Root-Cause Analysis

Why users are unhappy

Every thumbs-down, sorted into one of nine root causes. Telemetry-backed causes are decided by hard signal; the rest are inferred from the transcript and the user's comment.

Connector health

Failure rate per data connector across every diagnosed conversation. This is where the access and availability failures originate.
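
A per-connector failure rate can be sketched as follows. This is a minimal illustration, not the demo's code: the `ConnectorCall` record, its field names, and the rule that any status of 400 or above counts as a failure are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorCall:
    connector: str    # hypothetical field: name of the data connector
    status_code: int  # hypothetical field: HTTP-style status of the call


def failure_rate_by_connector(calls):
    """Return {connector: failed_calls / total_calls}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for call in calls:
        totals[call.connector] += 1
        # Assumption: 4xx and 5xx both count as failures.
        if call.status_code >= 400:
            failures[call.connector] += 1
    return {name: failures[name] / totals[name] for name in totals}


calls = [
    ConnectorCall("sharepoint", 200),
    ConnectorCall("sharepoint", 403),
    ConnectorCall("dataverse", 200),
    ConnectorCall("dataverse", 200),
]
rates = failure_rate_by_connector(calls)
```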

Weekly trend

Thumbs-down volume per ISO week, with the leading cause that week.
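
One way to bucket feedback by ISO week and pick the leading cause is sketched below. The `(date, cause)` input shape and the cause labels are illustrative assumptions, not the demo's schema.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_trend(feedback):
    """Map (iso_year, iso_week) -> (thumbs_down_count, leading_cause).

    `feedback` is an iterable of (date, cause) pairs, one per thumbs-down.
    """
    by_week = defaultdict(Counter)
    for day, cause in feedback:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        by_week[(iso[0], iso[1])][cause] += 1
    return {
        week: (sum(causes.values()), causes.most_common(1)[0][0])
        for week, causes in sorted(by_week.items())
    }


feedback = [
    (date(2024, 12, 30), "permission_denied"),  # belongs to ISO week 2025-W01
    (date(2025, 1, 2), "permission_denied"),
    (date(2025, 1, 3), "connector_outage"),
    (date(2025, 1, 6), "connector_outage"),     # ISO week 2025-W02
]
trend = weekly_trend(feedback)
```

Keying on the ISO year rather than the calendar year keeps late-December days in the week they belong to.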

By agent

Which deployed agents draw the most negative feedback, and their top cause.

Diagnoses, with evidence

A sample of real diagnoses. Each one names a primary cause, a confidence level, and the concrete evidence the engine used. Filter to explore.

What the evaluation does and does not prove

The honest part. This demo grades itself against synthetic ground truth using a deterministic stand-in for the LLM. That makes some of the score real and some of it circular, and the chart below separates the two.

Genuinely validated

The three telemetry-backed causes are decided by heuristics that read real schema fields (status codes, permission-denial records, empty retrieval sets) and never the ground-truth labels. Disable the LLM entirely and these three still hit perfect recall. This path, plus all the ingest, triangulation and aggregation plumbing, is exercised end to end and is real.
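
A hedged sketch of what such a heuristic could look like. The `Telemetry` record, its field names, the cause labels, the 5xx threshold, and the order of the checks are all assumptions made for illustration. The one property taken from the description above is that the function never sees a ground-truth label.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Telemetry:
    # All field names here are hypothetical stand-ins for the real schema.
    status_codes: List[int] = field(default_factory=list)
    permission_denials: int = 0
    retrieved_documents: Optional[int] = None  # None: no retrieval attempted


def telemetry_cause(t: Telemetry) -> Optional[str]:
    """Return a telemetry-backed cause, or None to defer to inference."""
    if t.permission_denials > 0:
        return "permission_denied"
    if any(code >= 500 for code in t.status_codes):
        return "connector_failure"
    if t.retrieved_documents == 0:
        return "empty_retrieval"
    return None  # no hard signal; the transcript-based path decides
```

Returning `None` rather than a guess keeps the hard-signal path separate from the inferred one, which is what lets the two be scored separately.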

Not independently validated

The other six classes are decided by the mock LLM, which is pattern-matched to the same structural anchors the generator injects. Their recall measures generator-vs-mock agreement, not classifier accuracy. Stub the LLM out and all six fall to zero, dragging macro recall down with them. Proving real accuracy on those six needs hand-labeled transcripts, not this loop.
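
Macro recall is the unweighted mean of per-class recall, which is why six classes dropping to zero pulls it down so far. The sketch below uses a toy dataset with one example per class and made-up class names; its numbers illustrate the arithmetic and are not the demo's reported figures.

```python
def macro_recall(true_labels, predicted_labels):
    """Unweighted mean of per-class recall over classes present in the truth."""
    classes = sorted(set(true_labels))
    recalls = []
    for cls in classes:
        indices = [i for i, t in enumerate(true_labels) if t == cls]
        hits = sum(1 for i in indices if predicted_labels[i] == cls)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Toy data: nine classes, one example each (class names are hypothetical).
truth = [f"telemetry_{i}" for i in range(3)] + [f"inferred_{i}" for i in range(6)]

# With the LLM stubbed out, only the three telemetry-backed classes are recovered.
stubbed = truth[:3] + ["unknown"] * 6

full_score = macro_recall(truth, truth)
stubbed_score = macro_recall(truth, stubbed)
```

In this toy setup the stubbed score is 3/9, because three classes contribute a recall of 1 and six contribute 0.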

Confusion matrix

Rows are the true cause; columns are what the engine predicted. A clean diagonal is the goal.
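
A minimal sketch of building such a matrix from paired labels, with rows as true causes and columns as predictions. The class names are placeholders.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Return matrix[i][j] = count of items with true classes[i], predicted classes[j]."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


classes = ["cause_a", "cause_b"]  # placeholder labels
truth = ["cause_a", "cause_a", "cause_b"]
predicted = ["cause_a", "cause_b", "cause_b"]
matrix = confusion_matrix(truth, predicted, classes)
```

Off-diagonal cells are the misclassifications: here one `cause_a` item was predicted as `cause_b`.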

How it works

A four-stage pipeline turns raw feedback into a triaged diagnosis. The analysis core depends only on typed schemas, so the synthetic source can be swapped for Dataverse, Microsoft Graph and Azure Application Insights without changing the logic.
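
The source-swapping idea can be sketched with a structural interface. Everything named here (`FeedbackRecord`, `FeedbackSource`, `SyntheticSource`, `count_by_agent`) is hypothetical and only illustrates the pattern of analysis code that depends on a typed record rather than on where the records came from.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Protocol


@dataclass(frozen=True)
class FeedbackRecord:
    # Hypothetical typed schema shared by every source.
    conversation_id: str
    agent: str
    comment: str


class FeedbackSource(Protocol):
    def fetch(self) -> Iterable[FeedbackRecord]: ...


class SyntheticSource:
    """Stand-in source; a real adapter would expose the same fetch() shape."""

    def fetch(self) -> Iterable[FeedbackRecord]:
        return [
            FeedbackRecord("c1", "hr-bot", "wrong answer"),
            FeedbackRecord("c2", "hr-bot", "could not open the file"),
            FeedbackRecord("c3", "it-bot", "no results"),
        ]


def count_by_agent(source: FeedbackSource) -> Dict[str, int]:
    """Analysis step that only touches the typed records, not the source."""
    counts: Dict[str, int] = {}
    for record in source.fetch():
        counts[record.agent] = counts.get(record.agent, 0) + 1
    return counts


by_agent = count_by_agent(SyntheticSource())
```

Because `FeedbackSource` is a `Protocol`, any class with a matching `fetch()` method satisfies it without inheriting from anything.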
