Pricing and billing for Azure SRE Agent

Learn how Azure SRE Agent billing works and what to expect on your Azure bill.

How billing works

Azure SRE Agent charges are based on Azure Agent Units (AAUs), a standardized measure of agentic processing used across all prebuilt Azure agents. Your monthly bill combines two types of charges.

Always-on flow (fixed cost)

When you create an agent, you pay a fixed rate as long as the agent exists:

Component	Rate
Always-on flow	4 AAUs per agent-hour

Always-on flow doesn't mean the agent is actively processing work. It represents the baseline cost of keeping your agent provisioned and available. Always-on billing continues from agent creation until the agent is deleted.
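
As a rough illustration of the fixed charge, the following sketch multiplies the 4 AAUs per agent-hour rate by the number of hours an agent exists. The function name and the 30-day month are assumptions made for this example; actual billing follows the agent's real creation and deletion times.

```python
ALWAYS_ON_AAUS_PER_AGENT_HOUR = 4  # rate from the always-on flow table


def always_on_aaus(hours_provisioned: float) -> float:
    """Estimate always-on AAUs for one agent that exists for the given hours."""
    if hours_provisioned < 0:
        raise ValueError("hours_provisioned must be non-negative")
    return ALWAYS_ON_AAUS_PER_AGENT_HOUR * hours_provisioned


# Assumption: a 30-day month (720 hours) during which the agent is never deleted.
print(always_on_aaus(30 * 24))  # 2880
```

The estimate depends only on how long the agent exists, not on how much work it does.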

Active flow (variable cost)

Whenever your agent works - whether a user asks a question interactively, an automation triggers a task, or an async operation runs in the background - the agent consumes active flow AAUs. Any time the agent is actively processing counts as active flow, regardless of how the work was initiated.

How tokens become AAUs

Every time your agent works, it consumes LLM tokens. The following table describes the four token types. Each type is metered separately at a model-specific rate, listed under AAU rates by model.

Token type	What it measures
Input	Tokens sent to the model (prompts, tool results, context)
Output	Tokens generated by the model (responses, reasoning)
Cache read	Tokens served from prompt cache (repeated context)
Cache write	Tokens written to prompt cache for future reuse

Your total active flow AAUs for a task equal the sum of the AAUs across all four token types.
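
The same relationship can be written as a formula. The symbols are introduced here for illustration only: $n_t$ is the number of tokens of type $t$ that a task consumes, and $r_t$ is the configured model's AAU rate per 1 million tokens of that type.

```latex
\text{AAUs}_{\text{task}}
  = \sum_{t \,\in\, \{\text{input},\ \text{output},\ \text{cache read},\ \text{cache write}\}}
    \frac{n_t}{10^{6}} \cdot r_t
```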

AAU rates by model

Number of AAUs consumed per 1 million tokens:

Model	Input	Output	Cache read	Cache write
Claude Opus 4.6	100 AAUs	500 AAUs	10 AAUs	125 AAUs
GPT 5.3 Codex	35 AAUs	280 AAUs	3.5 AAUs	0 AAUs
GPT 5.2	35 AAUs	280 AAUs	3.5 AAUs	0 AAUs

Note

Azure might add more models and providers in the future. Azure sets AAU rates and might update them as new models are released.

Key details:

Only processing time counts. Time the agent spends waiting for your response isn't billed as active flow.
Active flow resets monthly. Your AAU consumption counter resets at the beginning of each calendar month.
Set provider at agent level. Configure the model provider (Anthropic, OpenAI, and others) in your agent's settings. The corresponding model determines your AAU rates.

Active flow by task type

The number of tokens you use - and the AAUs you pay for - depends on how complex the task is. More complex tasks need more LLM reasoning steps, tool calls, and data processing, so they use more tokens.

Here's how token use translates to AAUs for common scenarios:

Scenario	Input tokens	Output tokens	Cache read	Cache write	Claude Opus 4.6 AAUs	GPT 5.3 Codex AAUs	Example
Quick question	~20K	~2K	~15K	~5K	~3.8	~1.3	"Show me recent alerts"
Incident investigation	~200K	~15K	~150K	~50K	~35.3	~11.7	Automated incident from Azure Monitor
Full remediation	~500K	~40K	~400K	~100K	~86.5	~30.1	"Diagnose and fix the failing deployment"

How the math works (Claude Opus 4.6 example - quick question):

Token type	Tokens	Rate per 1M	AAUs
Input	20K	100	2.0
Output	2K	500	1.0
Cache read	15K	10	0.15
Cache write	5K	125	0.625
Total			3.775 AAUs
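
The arithmetic above can be reproduced for every scenario and model with a short script. This is a minimal sketch: the rates and token counts are copied from the tables on this page, while the dictionary layout and the `task_aaus` helper are hypothetical names chosen for the example, not part of any Azure SDK.

```python
# AAUs per 1 million tokens, copied from the "AAU rates by model" table.
AAU_RATES = {
    "Claude Opus 4.6": {"input": 100, "output": 500, "cache_read": 10, "cache_write": 125},
    "GPT 5.3 Codex": {"input": 35, "output": 280, "cache_read": 3.5, "cache_write": 0},
}

# Approximate token counts, copied from the scenario table.
SCENARIOS = {
    "Quick question": {
        "input": 20_000, "output": 2_000, "cache_read": 15_000, "cache_write": 5_000,
    },
    "Incident investigation": {
        "input": 200_000, "output": 15_000, "cache_read": 150_000, "cache_write": 50_000,
    },
    "Full remediation": {
        "input": 500_000, "output": 40_000, "cache_read": 400_000, "cache_write": 100_000,
    },
}


def task_aaus(model: str, tokens: dict[str, int]) -> float:
    """Sum tokens * rate / 1,000,000 across the four token types."""
    rates = AAU_RATES[model]
    return sum(tokens[kind] * rates[kind] / 1_000_000 for kind in rates)


for name, tokens in SCENARIOS.items():
    costs = ", ".join(f"{model}: {task_aaus(model, tokens):.3f}" for model in AAU_RATES)
    print(f"{name} -> {costs}")
```

The printed values agree with the approximate figures in the scenario table. Actual token counts vary from task to task, so treat these numbers as estimates.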

Tip

To keep active flow costs predictable, set a monthly AAU allocation limit in Settings > Agent consumption.

Monitor your costs

In the SRE Agent portal

Go to Settings > Agent consumption to view your usage:

Monthly AAU limit: your combined always-on and active flow allocation with a button to adjust it
Total active flow consumption: donut chart breaking down usage by thread type (Chats, Incidents, Scheduled tasks, Triggers)
Daily active flow consumption: stacked bar chart showing AAU usage per day, color-coded by type
Consumption by thread: table listing every thread with its AAU cost, type, and status

For a full walkthrough, see Monitor agent usage.

Set an active flow spending limit

Select Change AAU allocation to set a monthly active flow AAU limit (minimum 500, maximum 1,000,000 AAUs). This limit applies to active flow only - always-on billing continues as long as the agent exists.

When your agent reaches the active flow limit, it becomes unavailable for chat and actions until the next month. Always-on charges continue for the rest of the month.
You can increase or decrease the allocation at any time.
Increases take effect immediately - if you raise the limit above current consumption, chat and actions resume right away.
Decreases take effect next month. Until then, the agent runs in always-on flow only.
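
To get a feel for how far an allocation goes, the sketch below checks a proposed limit against the documented range and estimates how many tasks of a given size fit under it. The function names are hypothetical, and the per-task figure is the approximate Claude Opus 4.6 incident investigation cost from the scenario table, so the result is a planning estimate rather than a guarantee.

```python
import math

MIN_LIMIT_AAUS = 500        # documented minimum allocation
MAX_LIMIT_AAUS = 1_000_000  # documented maximum allocation


def validate_limit(limit_aaus: float) -> None:
    """Raise ValueError if the proposed monthly active flow limit is out of range."""
    if not MIN_LIMIT_AAUS <= limit_aaus <= MAX_LIMIT_AAUS:
        raise ValueError(
            f"limit must be between {MIN_LIMIT_AAUS} and {MAX_LIMIT_AAUS} AAUs"
        )


def tasks_within_limit(limit_aaus: float, aaus_per_task: float) -> int:
    """Estimate how many whole tasks fit under the monthly active flow limit."""
    validate_limit(limit_aaus)
    if aaus_per_task <= 0:
        raise ValueError("aaus_per_task must be positive")
    return math.floor(limit_aaus / aaus_per_task)


# Assumption: every task costs about 35.25 AAUs
# (incident investigation on Claude Opus 4.6, from the scenario table).
print(tasks_within_limit(500, 35.25))  # 14
```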

Billing impact by action

Action	Active flow	Always-on	How to resume
Set budget limit (hit limit)	Stops	Still billed	Resets automatically at start of month
Stop agent	Stops	Still billed	Manually select Start in Settings > Basics
Delete agent	Stops	Stops	Create a new agent

In Azure Cost Management

For detailed billing breakdowns across multiple agents and resources, use Azure Cost Management in the Azure portal.

Cost optimization tips

Strategy	Impact	How to do it
Add context to your agent	Fewer wasted tokens	Add skills, knowledge, and documents so the agent stays grounded and concise. Persistent memory from past interactions improves efficiency over time.
Filter incidents with response plans	Less unnecessary work	Use response plans to filter Azure Monitor alerts by severity, service, or keyword - the agent only investigates incidents that match.
Batch work with scheduled tasks	Fewer runs	Schedule tasks to run daily or weekly instead of polling continuously. See Scheduled tasks.
Test in chat before automating	Avoids wasted runs	Try your prompt in chat or the Playground first. A misconfigured automation runs repeatedly and wastes AAUs.
Stop idle agents	Eliminates active flow	Go to Settings > Basics and select Stop. The agent keeps its configuration but stops all active flow. The always-on cost continues until you delete the agent.
Delete unused agents	Eliminates all costs	In sre.azure.com, open the agent and go to Settings > Basics > Delete agent. All billing stops immediately.

Frequently asked questions

How does the agent compute AAUs from tokens?

Every time your agent performs work, it tracks the LLM tokens consumed across all four token types and meters them at the AAU rates for your configured model. You can see your AAU consumption in Settings > Agent consumption.

Does the provider I choose affect my costs?

Yes. The model provider (Anthropic, OpenAI, and others), which you set at the agent level, determines which AAU rates apply. Different models have different rates. See the AAU rates table for current rates.

Which model should I choose?

Claude Opus 4.6 has higher AAU rates but typically produces more thorough investigations with fewer reasoning steps. For complex incident investigations and root cause analysis, Opus often reaches a conclusion in fewer tool calls, which can offset the higher per-token rate. GPT models are a good choice for simpler, high-volume tasks like scheduled compliance checks where cost efficiency matters more than depth. You can change your model provider at any time in Settings > Basics and compare results.
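
One way to reason about this trade-off is a break-even ratio: how much smaller the token footprint must be on the higher-rate model for a task to cost the same. The sketch below uses the incident investigation figures from the scenario table and assumes, for illustration only, that token usage shrinks proportionally across all token types. Real usage will vary, so compare actual results in Settings > Agent consumption.

```python
# Approximate per-task AAUs from the scenario table (incident investigation).
OPUS_AAUS = 35.25    # Claude Opus 4.6
CODEX_AAUS = 11.725  # GPT 5.3 Codex


def break_even_token_fraction(higher_cost: float, lower_cost: float) -> float:
    """Fraction of the original tokens the higher-rate model may use to match cost."""
    if higher_cost <= 0 or lower_cost <= 0:
        raise ValueError("costs must be positive")
    return lower_cost / higher_cost


fraction = break_even_token_fraction(OPUS_AAUS, CODEX_AAUS)
print(f"Break-even at about {fraction:.0%} of the tokens")  # about 33%
```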

Do I get charged when the agent is waiting for me to respond?

No. Only the time the agent spends actively processing a task counts as active flow. If the agent asks for your approval and waits, that waiting time isn't billed.

What counts as active flow?

Any time the agent is actively doing work counts as active flow. This work includes:

Interactive prompts: a user asking the agent a question in chat
Automation: scheduled tasks, incident response plans, or other automated triggers
Async operations: background investigations, report generation, or remediation tasks

In all cases, the agent meters tokens consumed as AAUs.

What happens if I stop my agent?

A stopped agent can't monitor your resources or respond to prompts, but it still incurs the fixed always-on cost. Active flow AAUs aren't consumed while stopped. To stop your agent, go to Settings > Basics and select Stop. To resume, select Start from the same page. To stop all billing entirely, delete the agent.

Can one agent handle multiple workloads?

Yes. A single agent can monitor multiple resources within its configured scope. Consolidating workloads under one agent reduces always-on costs compared to deploying separate agents.
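
The always-on saving from consolidation can be estimated directly from the 4 AAUs per agent-hour rate. In this sketch, the function name, the agent counts, and the 30-day month are assumptions for illustration. It covers always-on flow only and assumes the consolidated agent does the same total work, so active flow isn't modeled.

```python
ALWAYS_ON_AAUS_PER_AGENT_HOUR = 4  # rate from the always-on flow table


def always_on_savings(agents_before: int, agents_after: int, hours: float) -> float:
    """Always-on AAUs saved by running fewer agents for the same number of hours."""
    if agents_after > agents_before:
        raise ValueError("agents_after must not exceed agents_before")
    return ALWAYS_ON_AAUS_PER_AGENT_HOUR * hours * (agents_before - agents_after)


# Assumption: three agents consolidated into one over a 30-day month (720 hours).
print(always_on_savings(3, 1, 30 * 24))  # 5760
```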

Is there a free tier?

No. Azure SRE Agent charges begin at agent creation. See the Azure pricing calculator for current rates.

Is pricing the same in all regions?

Check the Azure pricing calculator for current pricing in your region.

Last updated on 2026-05-14