Track AI Agent Token Usage & Cost in Java — Sprout 1.3.0

AI Agents, Java, LLM cost tracking, Token usage, Observability, Spring Boot, Sprout, Release

Track AI agent token usage and cost in Java with sprout-monitoring

If you build AI agents in Java, sooner or later you need to answer a money question: how many tokens are my agents and models burning, and what is it costing me? Sprout 1.3.0 ships a new module, sprout-monitoring, that tracks agent and model usage, token counts and cost, broken down per model, per agent and per tool. The best part: you enable it with one dependency and one line of scan configuration, and your agents don't change at all.

TL;DR

  • New sprout-monitoring module on Maven Central (Sprout 1.3.0).
  • Tracks token usage, cost and call counts per model, per agent and per tool.
  • Built entirely on Sprout's event bus: the agent loop gains no new code paths, only one more subscriber to events it already publishes.
  • A new @UsageStore component with an in-memory default; swap it for a database, Prometheus or StatsD.
  • Works automatically as a Spring bean under the Spring Boot starter.

What sprout-monitoring tracks

Add the module, run your agents, and read back a snapshot of everything they used:

Models:
  WeatherStubModel: 6 calls, 297 in + 60 out tokens, $0.001791
Agents:
  WeatherAgent: 3 runs (3 ok / 0 failed), 6 iterations, 357 tokens
Tools:
  forecast: 3 calls
Totals: 6 model calls, 357 tokens, $0.001791

That's the whole value proposition: always-on token and cost accounting, broken down the way you actually think about your system — by model, by agent, by tool, and as a global total. No dashboards to wire up before you can see the numbers.

How it works: built on Sprout's event bus

Here's the part we're proud of — monitoring adds no work to the agent loop. It is built entirely on the event bus that ships in Sprout's core.

As agents run, the loop already publishes a predefined set of lifecycle events: AgentStartedEvent / AgentCompletedEvent / AgentFailedEvent, ModelRequestEvent / ModelResponseEvent, and ToolCalledEvent. Monitoring is just a well-behaved subscriber. A small collector listens for the events that carry usage and folds each one into a store:

public void subscribe(AbstractEventBus bus) {
    bus.subscribe(ModelResponseEvent.class, this::onModelResponse);
    bus.subscribe(AgentCompletedEvent.class, this::onAgentCompleted);
    bus.subscribe(AgentFailedEvent.class, this::onAgentFailed);
    bus.subscribe(ToolCalledEvent.class, this::onToolCalled);
}

Because it consumes the same events whether a model runs inside an agent or standalone, everything is counted exactly once, and the agent code stays oblivious. This is observability as a side effect of events that were already flowing — no instrumentation, no wrappers, no hooks sprinkled through your business logic.
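To make the subscriber pattern concrete, here is a minimal, self-contained sketch. MiniEventBus, MiniUsageCollector and the two event records are hypothetical stand-ins invented for illustration. They are not Sprout's actual classes, and the real bus and events carry more data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical events; Sprout's real event types will differ.
record ModelResponse(String model, long inputTokens, long outputTokens) {}
record ToolCalled(String tool) {}

// A tiny synchronous bus: handlers are stored per event class.
class MiniEventBus {
    private final Map<Class<?>, List<Consumer<Object>>> handlers = new HashMap<>();

    <E> void subscribe(Class<E> type, Consumer<? super E> handler) {
        handlers.computeIfAbsent(type, k -> new ArrayList<>())
                .add(event -> handler.accept(type.cast(event)));
    }

    void publish(Object event) {
        for (Consumer<Object> h : handlers.getOrDefault(event.getClass(), List.of())) {
            h.accept(event);
        }
    }
}

// The collector only listens; the publisher never knows it exists.
class MiniUsageCollector {
    final Map<String, Long> tokensByModel = new HashMap<>();
    final Map<String, Long> callsByTool = new HashMap<>();

    void subscribe(MiniEventBus bus) {
        bus.subscribe(ModelResponse.class, this::onModelResponse);
        bus.subscribe(ToolCalled.class, this::onToolCalled);
    }

    private void onModelResponse(ModelResponse e) {
        tokensByModel.merge(e.model(), e.inputTokens() + e.outputTokens(), Long::sum);
    }

    private void onToolCalled(ToolCalled e) {
        callsByTool.merge(e.tool(), 1L, Long::sum);
    }
}

class EventBusDemo {
    public static void main(String[] args) {
        MiniEventBus bus = new MiniEventBus();
        MiniUsageCollector collector = new MiniUsageCollector();
        collector.subscribe(bus);

        // The "agent loop" publishes the events it would publish anyway.
        bus.publish(new ModelResponse("WeatherStubModel", 40, 10));
        bus.publish(new ToolCalled("forecast"));
        bus.publish(new ModelResponse("WeatherStubModel", 59, 10));

        System.out.println(collector.tokensByModel);
        System.out.println(collector.callsByTool);
    }
}
```

The publishing side contains no monitoring code at all. Removing the collector.subscribe(bus) call turns accounting off without touching anything else.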

Pricing: turning tokens into cost

Token counts are exact; cost depends on your rates. You configure those per model with simple properties — sprout.monitoring.pricing.<modelName>.input / .output, each a price per one million tokens:

sprout.monitoring.pricing.AnthropicModelExecutor.input=3.0
sprout.monitoring.pricing.AnthropicModelExecutor.output=15.0

An unpriced model still has its tokens tracked, at zero cost, so you never lose data while you wait to fill in a price.
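The arithmetic behind a per-million rate is small enough to sketch. The PricingSketch class and its cost method below are hypothetical and only illustrate the calculation; how the module computes and rounds cost internally is not shown here. With the rates above, 297 input and 60 output tokens come to 297 × 3.0 / 1,000,000 + 60 × 15.0 / 1,000,000 = 0.000891 + 0.000900 = 0.001791, which is the figure in the sample report if its stub model is priced at these rates.

```java
import java.math.BigDecimal;

class PricingSketch {
    private static final BigDecimal PER_MILLION = BigDecimal.valueOf(1_000_000);

    // Rates are prices per one million tokens, as in the properties above.
    // BigDecimal avoids binary floating-point drift on very small amounts.
    static BigDecimal cost(long inputTokens, long outputTokens,
                           BigDecimal inputRate, BigDecimal outputRate) {
        BigDecimal input = inputRate.multiply(BigDecimal.valueOf(inputTokens));
        BigDecimal output = outputRate.multiply(BigDecimal.valueOf(outputTokens));
        // Dividing by a power of ten always yields a terminating decimal.
        return input.add(output).divide(PER_MILLION);
    }

    public static void main(String[] args) {
        BigDecimal priced = cost(297, 60, new BigDecimal("3.0"), new BigDecimal("15.0"));
        System.out.println("priced:   $" + priced.toPlainString());

        // An unpriced model: tokens are still counted, cost is zero.
        BigDecimal unpriced = cost(297, 60, BigDecimal.ZERO, BigDecimal.ZERO);
        System.out.println("unpriced: $" + unpriced.toPlainString());
    }
}
```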

A new @UsageStore component, wired like every other

The data lands in a @UsageStore — a new component type that behaves exactly like the stores and models you already know in Sprout. It's wired by its own @Processor, the same extension point that powers @Model and @ConversationStore: the processor registers the store (under the name usageStore) and subscribes the collector to the bus. That's it.

The module ships an in-memory default, InMemoryUsageStore. Want to keep usage somewhere durable, such as a database, Prometheus or StatsD? Implement the interface, annotate it, and place it in a package covered by your component scan:

@UsageStore
public class PrometheusUsageStore implements AbstractUsageStore {
    // record model calls, agent runs and tool invocations however you like
}

Your bean is the store now, with nothing else changing — the same swap-the-component model as @EventBus and @ConversationStore. This is the design principle Sprout keeps returning to: you pick the behaviour you want by choosing a component, not by editing a framework.
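As a rough idea of what the body of such a store might look like, here is a self-contained sketch of thread-safe counters. The UsageSink interface and its two methods are hypothetical, invented for this example; the actual methods of AbstractUsageStore are not listed in this article and may differ. The sketch also assumes events can arrive from several threads at once, which is why it uses concurrent collections.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical interface for illustration only.
interface UsageSink {
    void recordModelCall(String model, long inputTokens, long outputTokens);
    void recordToolCall(String tool);
}

// Counters that stay correct under concurrent updates.
class CountingUsageSink implements UsageSink {
    private final Map<String, LongAdder> tokensByModel = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> callsByTool = new ConcurrentHashMap<>();

    @Override
    public void recordModelCall(String model, long inputTokens, long outputTokens) {
        tokensByModel.computeIfAbsent(model, k -> new LongAdder())
                     .add(inputTokens + outputTokens);
    }

    @Override
    public void recordToolCall(String tool) {
        callsByTool.computeIfAbsent(tool, k -> new LongAdder()).increment();
    }

    long tokensFor(String model) {
        LongAdder adder = tokensByModel.get(model);
        return adder == null ? 0L : adder.sum();
    }

    long callsFor(String tool) {
        LongAdder adder = callsByTool.get(tool);
        return adder == null ? 0L : adder.sum();
    }
}

class UsageSinkDemo {
    public static void main(String[] args) {
        CountingUsageSink sink = new CountingUsageSink();
        sink.recordModelCall("WeatherStubModel", 40, 10);
        sink.recordModelCall("WeatherStubModel", 59, 10);
        sink.recordToolCall("forecast");

        System.out.println("tokens: " + sink.tokensFor("WeatherStubModel"));
        System.out.println("forecast calls: " + sink.callsFor("forecast"));
    }
}
```

A store backed by a database or a metrics system would replace the two maps with writes to that backend, keeping the recording methods as the only integration point.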

How to add monitoring to your Sprout app (3 steps)

1. Add the dependency from Maven Central:

<dependency>
    <groupId>io.github.ivannavas</groupId>
    <artifactId>sprout-monitoring</artifactId>
    <version>1.3.0</version>
</dependency>

2. Add the in-memory store's package to your component scan (the same way the RAG example pulls in core's in-memory vector store):

sprout.scan.base-packages=com.example.app,io.github.ivannavas.sprout.monitoring.impl

3. Read the totals anywhere:

AbstractUsageStore store = container.getSingleton("usageStore");
UsageSnapshot usage = store.snapshot();

System.out.println(usage.modelCalls() + " calls, "
        + usage.totalTokens() + " tokens, $" + usage.totalCost());

usage.byAgent().get("WeatherAgent");            // runs (completed/failed), iterations, tokens
usage.byModel().get("AnthropicModelExecutor");  // calls, tokens, cost
usage.byTool().get("forecast");                 // call count

No agent changes. No glue code.

Using it with Spring Boot

Because the store is a managed singleton named usageStore, if you run Sprout inside Spring Boot through the starter, it's exposed as a Spring bean automatically. Inject AbstractUsageStore into a @RestController and serve live token and cost metrics from an endpoint — no extra configuration.

FAQ

How do I track LLM token usage in Java?

Add sprout-monitoring to a Sprout app and put its store package on your component scan. Every model call publishes a ModelResponseEvent carrying its TokenUsage, which the collector records — so token usage is tracked automatically, with no changes to your agents.

How do I calculate the cost of an AI agent?

Configure a per-model rate with sprout.monitoring.pricing.<model>.input/.output (price per one million tokens). Monitoring multiplies your token counts by those rates and exposes totals globally and per model.

Does it work with OpenAI and Anthropic models?

Yes. It works with any Sprout @Model, including sprout-openai and sprout-anthropic, because it observes the model events rather than any specific provider API.

Can I store usage in a database or Prometheus instead of memory?

Yes. Implement AbstractUsageStore, annotate it @UsageStore, and it replaces the in-memory default — pointing usage at a database, Prometheus, StatsD or anywhere else.

Does monitoring slow down my agents?

No. It only subscribes to events the agent loop already publishes; the default in-memory store does cheap in-process aggregation, so there are no extra model calls and no measurable overhead.

Get it

sprout-monitoring is live on Maven Central as part of Sprout 1.3.0, alongside the rest of the modules (core, anthropic, openai, mcp, orchestration, spring-boot-starter). There's a runnable example in sprout-examples (-Pmonitoring) that produces the report above, fully offline.

Build agents you can see — then you'll know exactly what they cost.
