Illustrative screening example: not client data
Readiness probe example
A screening step, not a full SDVM diagnosis
This example shows how a workflow might be assessed qualitatively before a controlled pilot. It does not represent a completed external pilot or proven improvement.
Workflow candidate
- Label: Recurring agentic bugfix workflow (synthetic)
- Recurrence: Several comparable runs per week on similar issue types
- Owner: One technical lead available to review reports and test one intervention
Evidence received
- Langfuse export for six recent runs (anonymized metadata summary)
- Observable repair loops and handoff summaries in trace spans
- No raw customer payloads or credentials included
Comparability assessment
Partial. PRE baseline runs are broadly comparable, but two POST candidates changed model temperature and issue scope, weakening direct PRE/POST alignment until capture is tightened.
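The comparability check described above can be sketched in a few lines. This is a minimal illustration, not the screening tool itself: the `RunMeta` record, its field names, and the sample values are all hypothetical and do not reflect any particular trace export schema. It flags POST runs whose settings fall outside what the PRE baseline used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMeta:
    """Hypothetical per-run metadata summary (field names are assumptions)."""
    run_id: str
    phase: str          # "PRE" or "POST"
    temperature: float
    issue_scope: str


def comparability_drift(runs, keys=("temperature", "issue_scope")):
    """Map each POST run_id to the settings that differ from every PRE run."""
    pre_runs = [r for r in runs if r.phase == "PRE"]
    # Values observed in the PRE baseline, per setting.
    baseline = {k: {getattr(r, k) for r in pre_runs} for k in keys}
    drift = {}
    for run in runs:
        if run.phase != "POST":
            continue
        changed = [k for k in keys if getattr(run, k) not in baseline[k]]
        if changed:
            drift[run.run_id] = changed
    return drift


# Synthetic values for illustration only.
runs = [
    RunMeta("pre-1", "PRE", 0.2, "null-pointer fix"),
    RunMeta("pre-2", "PRE", 0.2, "null-pointer fix"),
    RunMeta("pre-3", "PRE", 0.2, "null-pointer fix"),
    RunMeta("post-1", "POST", 0.2, "null-pointer fix"),
    RunMeta("post-2", "POST", 0.7, "null-pointer fix"),
    RunMeta("post-3", "POST", 0.2, "multi-module refactor"),
]
drift = comparability_drift(runs)
print(drift)
```

Any run that appears in the result would need to be recaptured or excluded before a direct PRE/POST comparison.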
Signal coverage
- Repair pressure and skipped-step signals: present in most runs
- Handoff boundaries: partially captured; some tool calls lack stage labels
- Evidence strength for edge-level tuning: moderate, not definitive
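One way to locate the handoff-capture gap noted above is to list tool-call spans that carry no stage label. The sketch below assumes spans have already been flattened into plain dictionaries; the keys `run_id`, `type`, `name`, and `stage` are hypothetical and would need to be mapped from whatever the actual export provides. It reports which calls lack labels and does not compute a score.

```python
def unlabeled_tool_calls(spans):
    """Group the names of tool-call spans with no stage label by run_id."""
    missing = {}
    for span in spans:
        # A missing key, None, or an empty string all count as "unlabeled".
        if span.get("type") == "tool_call" and not span.get("stage"):
            missing.setdefault(span["run_id"], []).append(span["name"])
    return missing


# Synthetic spans for illustration only.
spans = [
    {"run_id": "pre-1", "type": "tool_call", "name": "run_tests", "stage": "verify"},
    {"run_id": "pre-1", "type": "tool_call", "name": "apply_patch", "stage": None},
    {"run_id": "pre-2", "type": "tool_call", "name": "read_file"},
    {"run_id": "pre-2", "type": "llm_call", "name": "plan", "stage": ""},
]
gaps = unlabeled_tool_calls(spans)
print(gaps)
```

The output points to the specific calls where stage labelling would need to be tightened before edge-level tuning.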
Risks / limitations
- Small sample size for POST follow-up
- Intervention surface not yet frozen to one workflow edge
- Screening does not establish causal proof or guaranteed improvement
Readiness classification
Partial: sufficient for a scoped PRE diagnostic and readiness conversation; insufficient for immediate POST/DELTA closure without refining capture and comparability.
Qualitative options used at screening: sufficient, partial, or insufficient. No numeric readiness score is assigned.
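The three qualitative options can be represented as an enumeration rather than a number. The decision rule in this sketch is an assumption made purely for illustration; the screening step does not define a formal rule, and the three boolean inputs are hypothetical simplifications of the assessments above.

```python
from enum import Enum


class Readiness(Enum):
    SUFFICIENT = "sufficient"
    PARTIAL = "partial"
    INSUFFICIENT = "insufficient"


def classify(pre_comparable: bool, signals_present: bool,
             post_comparable: bool) -> Readiness:
    """Hypothetical rule: PRE evidence gates everything; POST decides closure."""
    if not (pre_comparable and signals_present):
        return Readiness.INSUFFICIENT
    if post_comparable:
        return Readiness.SUFFICIENT
    # PRE evidence is usable, but PRE/POST alignment is still weak.
    return Readiness.PARTIAL


# Inputs loosely mirroring this example: comparable PRE runs, signals
# present in most runs, POST candidates not yet comparable.
result = classify(pre_comparable=True, signals_present=True,
                  post_comparable=False)
print(result.value)
```

Keeping the result as a label preserves the qualitative framing and avoids implying a precision the evidence does not support.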
Recommended next step
Run a baseline PRE on the available traces, tighten handoff/checkpoint capture on the flagged edge, and retest one narrow intervention before expanding workflow scope.
Decision options: stabilize, retune, refine capture, or expand scope.