Illustrative screening example (not client data)

Readiness probe example

A screening step, not a full SDVM diagnosis

This example shows how a workflow might be assessed qualitatively before a controlled pilot. It does not represent a completed external pilot or proven improvement.

Workflow candidate

  • Label: Recurring agentic bugfix workflow (synthetic)
  • Recurrence: Several comparable runs per week on similar issue types
  • Owner: One technical lead available to review reports and test one intervention

Evidence received

  • Langfuse export for six recent runs (anonymized metadata summary)
  • Observable repair loops and handoff summaries in trace spans
  • No raw customer payloads or credentials included

Comparability assessment

Partial. PRE baseline runs are broadly comparable, but two POST candidates changed model temperature and issue scope, weakening direct PRE/POST alignment until capture is tightened.
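The drift described above can be surfaced mechanically before anyone reads the traces in detail. The following is a minimal sketch, assuming each run has already been reduced to a small metadata record. The field names (`phase`, `temperature`, `issue_scope`) and all values are hypothetical and synthetic; they are not a Langfuse schema and not taken from any real export.

```python
from collections import Counter

# Synthetic run metadata. Field names and values are assumptions for illustration only.
RUNS = [
    {"run_id": "pre-1", "phase": "PRE", "temperature": 0.2, "issue_scope": "single-file"},
    {"run_id": "pre-2", "phase": "PRE", "temperature": 0.2, "issue_scope": "single-file"},
    {"run_id": "pre-3", "phase": "PRE", "temperature": 0.2, "issue_scope": "single-file"},
    {"run_id": "pre-4", "phase": "PRE", "temperature": 0.2, "issue_scope": "single-file"},
    {"run_id": "post-1", "phase": "POST", "temperature": 0.7, "issue_scope": "single-file"},
    {"run_id": "post-2", "phase": "POST", "temperature": 0.7, "issue_scope": "multi-file"},
]

# Settings that should stay fixed between PRE and POST for a direct comparison.
CONTROLLED_FIELDS = ("temperature", "issue_scope")


def baseline_config(runs):
    """Return the most common value of each controlled field across PRE runs."""
    pre = [r for r in runs if r["phase"] == "PRE"]
    return {
        field: Counter(r[field] for r in pre).most_common(1)[0][0]
        for field in CONTROLLED_FIELDS
    }


def drifted_runs(runs):
    """Map each POST run_id to the controlled fields that differ from the PRE baseline."""
    base = baseline_config(runs)
    drift = {}
    for r in runs:
        if r["phase"] != "POST":
            continue
        changed = [f for f in CONTROLLED_FIELDS if r[f] != base[f]]
        if changed:
            drift[r["run_id"]] = changed
    return drift


print(drifted_runs(RUNS))
```

The output is a list of flagged runs and fields, not a score. Deciding whether the drift makes the assessment partial or insufficient remains a reviewer judgment.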

Signal coverage

  • Repair pressure and skipped-step signals: present in most runs
  • Handoff boundaries: partially captured; some tool calls lack stage labels
  • Evidence strength for edge-level tuning: moderate, not definitive
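The handoff-boundary gap noted above (tool calls without stage labels) is the kind of thing a short script can count per run. This is a rough sketch under stated assumptions: spans are represented as plain dictionaries, and the `kind` and `stage` keys are hypothetical names chosen for this example rather than fields of any real trace export.

```python
from collections import defaultdict

# Synthetic spans. "kind" and "stage" are hypothetical field names for illustration.
SPANS = [
    {"run_id": "run-1", "kind": "tool_call", "stage": "reproduce"},
    {"run_id": "run-1", "kind": "tool_call", "stage": None},
    {"run_id": "run-1", "kind": "handoff", "stage": "patch"},
    {"run_id": "run-2", "kind": "tool_call", "stage": "patch"},
    {"run_id": "run-2", "kind": "tool_call", "stage": ""},
    {"run_id": "run-2", "kind": "tool_call", "stage": None},
]


def unlabeled_tool_calls(spans):
    """Count tool-call spans per run whose stage label is missing or empty."""
    missing = defaultdict(int)
    for span in spans:
        if span["kind"] == "tool_call" and not span.get("stage"):
            missing[span["run_id"]] += 1
    return dict(missing)


print(unlabeled_tool_calls(SPANS))
```

Runs that appear in the result are candidates for tighter capture. Runs that do not appear had a stage label on every tool call in this synthetic sample.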

Risks / limitations

  • Small sample size for POST follow-up
  • Intervention surface not yet frozen to one workflow edge
  • Screening does not establish causal proof or guaranteed improvement

Readiness classification

Partial: sufficient for a scoped PRE diagnostic and readiness conversation; insufficient for immediate POST/DELTA closure without refining capture and comparability.

Qualitative options used at screening: sufficient, partial, or insufficient. No numeric readiness score is assigned.

Recommended next step

Run a baseline PRE on the available traces, tighten handoff/checkpoint capture on the flagged edge, and retest one narrow intervention before expanding workflow scope.

Decision options: stabilize, retune, refine capture, or expand scope.
