Recommendation algorithm

Sovara’s recommendation algorithm keeps annotation focused. It does not try to replace the reviewer. It tries to decide which runs are worth a reviewer’s time.

What it optimizes for

The queue favors runs that may add new information. A run is more useful when it shows behavior not already covered by nearby annotated examples, exposes a partial gap in the agent’s capability, or contains a failure that is hard to judge from the final output alone. The algorithm also reduces repeat work. Runs that look similar to already reviewed successes should not keep coming back unless they reveal a new pattern.
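The idea of favoring new information and suppressing repeats can be illustrated with a small sketch. This is not Sovara's implementation: the embedding representation, the cosine measure, the `AnnotatedRun` type, the `review_value` function, and the 0.5 floor for runs near a reviewed failure are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class AnnotatedRun:
    """Hypothetical record of a run a reviewer has already labeled."""
    embedding: list[float]  # assumed vector summary of the run's behavior
    is_success: bool        # the reviewer's label


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def review_value(candidate: list[float], annotated: list[AnnotatedRun]) -> float:
    """Score in [0, 1]; higher means the run is more worth a reviewer's time."""
    if not annotated:
        return 1.0  # nothing reviewed yet, so every run is new information
    nearest = max(annotated, key=lambda r: cosine(candidate, r.embedding))
    similarity = max(0.0, cosine(candidate, nearest.embedding))
    novelty = 1.0 - similarity
    if nearest.is_success:
        # Close to a reviewed success: redundant unless it differs.
        return novelty
    # Close to a reviewed failure: keep some weight (assumed floor of 0.5).
    return max(novelty, 0.5)
```

Under these assumptions, a run identical to a reviewed success scores 0.0 and drops out of the queue, while a run unlike anything reviewed scores 1.0.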

Priority labels

Sovara groups surfaced runs by why they deserve attention. The UI shows five labels:

Repeated failure: the run matches a failure pattern already seen in related traces.
Failure risk: the run is likely wrong, but it is not yet established as a repeated failure pattern.
Novel behavior: the run is meaningfully different from reviewed examples.
Partially covered: related examples exist, but the run still tests a gap.
Covered: successful references already cover the behavior, so review is optional and lower priority.

The label is a starting point, not a verdict. The reviewer still decides whether the run should be marked as success or failure.
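As a rough sketch of how a client might model the five labels and order a queue by them: the `PriorityLabel` enum, the `RANK` table, and `sort_queue` are hypothetical names, and the ranking simply follows the order in which the labels are listed above. The source only states that Covered is lower priority; the relative order of the other four is an assumption.

```python
from enum import Enum


class PriorityLabel(Enum):
    """The five labels shown in the UI (enum itself is illustrative)."""
    REPEATED_FAILURE = "Repeated failure"
    FAILURE_RISK = "Failure risk"
    NOVEL_BEHAVIOR = "Novel behavior"
    PARTIALLY_COVERED = "Partially covered"
    COVERED = "Covered"


# Assumed ranking: definition order, with Covered last.
RANK = {label: i for i, label in enumerate(PriorityLabel)}


def sort_queue(runs: list[tuple[str, PriorityLabel]]) -> list[tuple[str, PriorityLabel]]:
    """Order (run_id, label) pairs by rank; ties keep their original order."""
    return sorted(runs, key=lambda run: RANK[run[1]])


queue = [
    ("run-a", PriorityLabel.COVERED),
    ("run-b", PriorityLabel.NOVEL_BEHAVIOR),
    ("run-c", PriorityLabel.REPEATED_FAILURE),
]
ordered_ids = [run_id for run_id, _ in sort_queue(queue)]
```

The label only determines where a run sits in the queue; the success or failure verdict is still recorded separately by the reviewer.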

Reviewer guidance

When Sovara surfaces a run, it includes a short explanation and links to the trace steps worth checking first. Start there, then open the full run when the case needs more context.

Figure: Inspect run button in the annotation queue.

Click Inspect run, then open Run chat and start with:

What should I look at first?

Run chat can point you to the final answer, retrieved evidence, related failed behavior, and the steps cited by the recommendation. Use that as the starting point for the label and ground truth. Good annotations improve future recommendations: they tell Sovara which behaviors are already covered, which failures matter, and which domain-specific insights should become lessons.
