Auto-injection is how SovaraDB becomes useful during a run. When your agent makes a model call inside a Sovara run, Sovara can retrieve relevant priors and insert them into the prompt as structured context. The default path is automatic. You create priors once, keep the desktop app running, and Sovara handles retrieval during future runs.

Why auto-injection is useful

The useful prior is often not known at the start of a run. A financial-analysis agent might begin with a broad question, retrieve filings, identify a liquidity subtask, and only then need the quick-ratio prior. Auto-injection lets Sovara consider priors close to the step where they matter. That keeps the guidance timely and avoids forcing the agent to carry every lesson from the beginning.

[Figure: Sovara run view showing priors injected into a model call]

Why not inject everything at the top?

Top-loading all guidance is tempting, but it breaks down quickly:
  • The relevant context can change during a run
  • Early guidance can be stale by the time the agent reaches a later step
  • Large prompt blocks dilute attention
  • Unrelated priors can push the agent toward the wrong behavior
Sovara retrieves priors at runtime so the injected context can match the current step instead of only the original user request.
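
To make the contrast with top-loading concrete, here is a minimal sketch of step-scoped retrieval. It does not use Sovara's API or reflect how SovaraDB ranks priors: the `Prior` class, the keyword-overlap matching, and the sample priors are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prior:
    # Hypothetical shape for a stored lesson; not SovaraDB's schema.
    name: str
    keywords: frozenset
    text: str


PRIORS = [
    Prior("quick-ratio", frozenset({"liquidity", "quick", "ratio"}),
          "Exclude inventory when computing the quick ratio."),
    Prior("filing-dates", frozenset({"filing", "10-k", "fiscal"}),
          "Check the fiscal year end before comparing filings."),
]


def retrieve_for_step(step_text: str, priors=PRIORS) -> list:
    """Return the priors whose keywords overlap the words of the current step."""
    words = set(step_text.lower().split())
    return [p for p in priors if p.keywords & words]


steps = [
    "Summarize the company's financial health",
    "Retrieve the latest 10-K filing",
    "Compute the liquidity position using the quick ratio",
]

# Each step is matched on its own text, not on the original user request.
for step in steps:
    print(step, "->", [p.name for p in retrieve_for_step(step)])
```

In this toy run the broad opening step matches nothing, and the quick-ratio prior only surfaces once the agent reaches the liquidity step, which is the behavior the section describes.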

Why not always inject?

Even relevant priors have a cost. Every injected token competes with task context, retrieved evidence, tool output, and the model’s own reasoning budget. Sovara therefore skips injection when that cost is not justified. It also uses prefix caching, so stable prior context does not need to be retrieved and inserted again for the same effective prompt state.
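
The two ideas in this section, skipping injection and reusing work for an unchanged prompt state, can be sketched together. This is an illustration under stated assumptions, not Sovara's implementation: the `MIN_SCORE` threshold, the keyword-overlap score, and the hash-keyed cache are hypothetical.

```python
import hashlib

MIN_SCORE = 0.5  # hypothetical relevance threshold, not a Sovara setting

# Hypothetical priors: text mapped to the keywords that make it relevant.
PRIORS = {
    "Exclude inventory when computing the quick ratio.": {"liquidity", "quick", "ratio"},
}

_cache = {}   # prompt-state hash -> priors block
lookups = 0   # counts real retrievals, to show the cache working


def priors_block(prompt_prefix: str) -> str:
    """Return the priors block for this prompt state, or "" if nothing qualifies."""
    global lookups
    key = hashlib.sha256(prompt_prefix.encode("utf-8")).hexdigest()
    if key in _cache:
        return _cache[key]  # same prompt state: reuse, no new retrieval

    lookups += 1
    words = set(prompt_prefix.lower().split())
    kept = [
        text
        for text, keywords in PRIORS.items()
        if len(keywords & words) / len(keywords) >= MIN_SCORE
    ]
    block = "\n".join(kept)
    _cache[key] = block  # an empty block is cached too
    return block


first = priors_block("Compute the quick ratio")
second = priors_block("Compute the quick ratio")   # served from the cache
skipped = priors_block("Summarize the filing")     # below threshold: no injection
```

Here the second call with the same prefix does not trigger a new lookup, and the unrelated prefix yields an empty block, so nothing is added to the prompt.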

Why not let the LLM decide?

An LLM can help reason about context, but it is not the right control point for every injection decision. The model may not know which prior it needs until after it has already missed the lesson. Sovara keeps that control in the runtime by evaluating possible injections at each step. That gives the system a chance to apply the right domain lesson before the model answers.

Configure injection

Open Settings and go to the project’s SovaraDB section.
  • Turn on Disable prior injection when a project should run without runtime priors.
  • Turn on Low-latency prior injection for latency-critical applications.
Low-latency mode trades some retrieval depth for speed. Use it when the agent is in a tight interactive loop and small latency changes matter.

[Figure: SovaraDB project settings for disabling prior injection and enabling low-latency mode]

Manual injection

Automatic injection is the default, but you can manually retrieve priors when you need explicit control. Manual injection must happen inside an active Sovara run. When Sovara detects a managed manual <sovara-priors> block for a call, it skips automatic retrieval for that same call.
from openai import OpenAI
import sovara
from sovara.runner.priors import inject_priors

client = OpenAI()

def answer(question: str) -> str:
    # Manual injection must happen inside an active Sovara run.
    with sovara.run("financial-analysis"):
        # Retrieve priors explicitly, using the question as retrieval context.
        priors_context = inject_priors(
            context={"question": question},
            method="retrieve",
        )
        # Prepend the priors block when one is returned; otherwise send the bare question.
        prompt = f"{priors_context}\n\nQuestion: {question}" if priors_context else question

        response = client.responses.create(
            model="gpt-5.4-mini",
            input=prompt,
        )
        return response.output_text
Use manual injection sparingly. It is best for custom orchestration, scoped experiments, or paths where you know exactly which retrieval context should be used.
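
The rule that a manual `<sovara-priors>` block suppresses automatic retrieval for the same call can be pictured with a small sketch. It is a simplification, not Sovara's detection logic: matching on the opening tag alone is an assumption, and `should_auto_retrieve` and `build_prompt` are hypothetical helper names. How Sovara decides that a block is "managed" is not covered here.

```python
# Tag name taken from the text above; matching on the opening tag alone is an assumption.
MANUAL_TAG = "<sovara-priors>"


def should_auto_retrieve(prompt: str) -> bool:
    """Automatic retrieval is skipped when the call already carries a manual block."""
    return MANUAL_TAG not in prompt


def build_prompt(prompt: str, auto_block: str) -> str:
    """Prepend automatically retrieved priors unless a manual block is present."""
    if not should_auto_retrieve(prompt) or not auto_block:
        return prompt
    return f"{auto_block}\n\n{prompt}"
```

With this shape, a prompt that already contains a manual block passes through unchanged, so the same call never receives both manual and automatic priors.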