Illustrative screening example — not client data

Readiness probe example

A screening step, not a full SDVM diagnosis

This example shows how a workflow might be assessed qualitatively before a controlled pilot. It does not represent a completed external pilot or proven improvement.

Workflow candidate

Label: Recurring agentic bugfix workflow (synthetic)
Recurrence: Several comparable runs per week on similar issue types
Owner: One technical lead available to review reports and test one intervention

Evidence received

Langfuse export for six recent runs (anonymized metadata summary)
Observable repair loops and handoff summaries in trace spans
No raw customer payloads or credentials included

Comparability assessment

Partial. The PRE baseline runs are broadly comparable, but two POST candidates changed model temperature and issue scope, which weakens direct PRE/POST comparison until capture is tightened.
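A comparability check of this kind can be sketched as a field-by-field comparison of run settings between a PRE baseline run and a POST candidate. This is a minimal illustration only: the `RunConfig` fields, the run IDs, and all values are hypothetical and do not reflect the Langfuse export schema or any real run.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical per-run settings; field names are assumptions."""
    run_id: str
    model: str
    temperature: float
    issue_scope: str


def config_drift(baseline: RunConfig, candidate: RunConfig) -> dict[str, tuple]:
    """Return {field: (baseline_value, candidate_value)} for settings that differ."""
    base, cand = asdict(baseline), asdict(candidate)
    return {
        field: (base[field], cand[field])
        for field in base
        if field != "run_id" and base[field] != cand[field]
    }


# Synthetic values for illustration only.
pre = RunConfig("pre-1", "model-a", 0.2, "single-file bugfix")
post = RunConfig("post-1", "model-a", 0.7, "multi-file bugfix")

drift = config_drift(pre, post)
print(drift)
```

An empty result would suggest the pair is comparable on the recorded settings; any listed field is a reason to treat a PRE/POST difference with caution rather than as evidence of improvement.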

Signal coverage

Repair pressure and skipped-step signals: present in most runs
Handoff boundaries: partially captured; some tool calls lack stage labels
Evidence sufficiency for edge-level tuning: moderate, not definitive
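The gap in handoff capture could be located with a simple count of tool-call spans that carry no stage label, grouped by run. The sketch below assumes spans have already been flattened into dictionaries; the keys `run_id`, `type`, and `stage` are hypothetical names chosen for illustration, not fields of the actual trace export.

```python
from collections import Counter


def stage_label_gaps(spans: list[dict]) -> dict[str, int]:
    """Count, per run, the tool-call spans that have no stage label."""
    gaps: Counter = Counter()
    for span in spans:
        # A missing key and an empty/None value both count as unlabeled.
        if span.get("type") == "tool_call" and not span.get("stage"):
            gaps[span["run_id"]] += 1
    return dict(gaps)


# Synthetic spans for illustration only.
spans = [
    {"run_id": "run-1", "type": "tool_call", "stage": "repair"},
    {"run_id": "run-1", "type": "tool_call", "stage": None},
    {"run_id": "run-2", "type": "tool_call"},
    {"run_id": "run-2", "type": "handoff", "stage": "review"},
]

unlabeled = stage_label_gaps(spans)
print(unlabeled)
```

The counts point to where capture needs tightening. They are not a readiness score.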

Risks / limitations

Small sample size for POST follow-up
Intervention surface not yet frozen to one workflow edge
Screening does not establish causal proof or guaranteed improvement

Readiness classification

Partial — sufficient for a scoped PRE diagnostic and readiness conversation; insufficient for immediate POST/DELTA closure without refining capture and comparability.

Qualitative options used at screening: sufficient, partial, or insufficient. No numeric readiness score is assigned.
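If screening results are recorded in a structured form, the three qualitative options could be modeled as a closed set so that no numeric score can slip in. The following is a sketch under that assumption; `Readiness` and `ScreeningResult` are hypothetical names, and the rationale text paraphrases the classification above.

```python
from dataclasses import dataclass
from enum import Enum


class Readiness(Enum):
    """The three qualitative screening options; deliberately not numeric."""
    SUFFICIENT = "sufficient"
    PARTIAL = "partial"
    INSUFFICIENT = "insufficient"


@dataclass(frozen=True)
class ScreeningResult:
    """Hypothetical record of one screening outcome."""
    workflow_label: str
    readiness: Readiness
    rationale: str


result = ScreeningResult(
    workflow_label="Recurring agentic bugfix workflow (synthetic)",
    readiness=Readiness("partial"),
    rationale="Scoped PRE diagnostic is feasible; POST/DELTA closure needs tighter capture.",
)
print(result.readiness.value)
```

Constructing `Readiness` from any value outside the three options raises `ValueError`, which keeps the record limited to the qualitative categories.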

Recommended next step

Run a baseline PRE diagnostic on the available traces, tighten handoff/checkpoint capture on the flagged edge, and retest one narrow intervention before expanding workflow scope.

Decision options: stabilize, retune, refine capture, or expand scope.
