Learn how Azure SRE Agent billing works and what to expect on your Azure bill.
How billing works
Azure SRE Agent charges are based on Azure Agent Units (AAUs), a standardized measure of agentic processing used across all prebuilt Azure agents. Your monthly bill combines two types of charges.
Always-on flow (fixed cost)
When you create an agent, you pay a fixed rate as long as the agent exists:
| Component | Rate |
|---|---|
| Always-on flow | 4 AAUs per agent-hour |
Always-on flow doesn't mean the agent is actively processing work. It represents the baseline cost of keeping your agent provisioned and available. Always-on billing continues from agent creation until the agent is deleted.
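As a rough illustration of the fixed charge, the following sketch multiplies the 4 AAUs per agent-hour rate by the number of hours an agent exists. The function name and the 30-day month are assumptions made for this example. The result is in AAUs, not currency.

```python
ALWAYS_ON_AAUS_PER_AGENT_HOUR = 4  # rate from the table above


def always_on_aaus(hours_provisioned: float, agent_count: int = 1) -> float:
    """Return always-on AAUs for agents that exist for the given number of hours.

    Hypothetical helper for estimation only; it isn't part of any Azure SDK.
    """
    if hours_provisioned < 0 or agent_count < 0:
        raise ValueError("hours and agent count must be non-negative")
    return ALWAYS_ON_AAUS_PER_AGENT_HOUR * hours_provisioned * agent_count


# Assumed example: one agent that exists for a full 30-day month (720 hours).
print(always_on_aaus(30 * 24))  # 2880
```

Because the charge accrues from creation until deletion, the only inputs are the agent's lifetime and the number of agents. Whether the agent is stopped or idle doesn't change the result.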
Active flow (variable cost)
Whenever your agent works - whether a user asks a question interactively, an automation triggers a task, or an async operation runs in the background - the agent consumes active flow AAUs. Any time the agent is actively processing counts as active flow, regardless of how the work was initiated.
How tokens become AAUs
Every time your agent works, it consumes LLM tokens. Each token type is metered separately. The following table describes what each type measures.
| Token type | What it measures |
|---|---|
| Input | Tokens sent to the model (prompts, tool results, context) |
| Output | Tokens generated by the model (responses, reasoning) |
| Cache read | Tokens served from prompt cache (repeated context) |
| Cache write | Tokens written to prompt cache for future reuse |
Your total active flow AAUs for a task equal the sum of the AAUs consumed across all four token types.
AAU rates by model
Number of AAUs consumed per 1 million tokens:
| Model | Input | Output | Cache read | Cache write |
|---|---|---|---|---|
| Claude Opus 4.6 | 100 AAUs | 500 AAUs | 10 AAUs | 125 AAUs |
| GPT 5.3 Codex | 35 AAUs | 280 AAUs | 3.5 AAUs | 0 AAUs |
| GPT 5.2 | 35 AAUs | 280 AAUs | 3.5 AAUs | 0 AAUs |
Rates are per 1 million tokens.
Note
Azure might add more models and providers in the future. Azure sets AAU rates and might update them as new models are released.
Key details:
- Only processing time counts. Time the agent spends waiting for your response isn't billed as active flow.
- Active flow resets monthly. Your AAU consumption counter resets at the beginning of each calendar month.
- Set provider at agent level. Configure the model provider (Anthropic, OpenAI, and others) in your agent's settings. The corresponding model determines your AAU rates.
Active flow by task type
The number of tokens you use - and the AAUs you pay for - depends on how complex the task is. More complex tasks need more LLM reasoning steps, tool calls, and data processing, so they use more tokens.
Here's how token use translates to AAUs for common scenarios:
| Scenario | Input tokens | Output tokens | Cache read | Cache write | Claude Opus 4.6 AAUs | GPT 5.3 Codex AAUs | Example |
|---|---|---|---|---|---|---|---|
| Quick question | ~20K | ~2K | ~15K | ~5K | ~3.8 | ~1.3 | "Show me recent alerts" |
| Incident investigation | ~200K | ~15K | ~150K | ~50K | ~35.3 | ~11.7 | Automated incident from Azure Monitor |
| Full remediation | ~500K | ~40K | ~400K | ~100K | ~86.5 | ~30.1 | "Diagnose and fix the failing deployment" |
How the math works (Claude Opus 4.6 example - quick question):
| Token type | Tokens | Rate per 1M | AAUs |
|---|---|---|---|
| Input | 20K | 100 | 2.0 |
| Output | 2K | 500 | 1.0 |
| Cache read | 15K | 10 | 0.15 |
| Cache write | 5K | 125 | 0.625 |
| Total | | | 3.775 |
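The same arithmetic can be sketched in a few lines of Python. The rates are copied from the AAU rates table in this article and might change. The dictionary layout and the function name `active_flow_aaus` are assumptions made for illustration, not part of any Azure API.

```python
# AAUs per 1 million tokens, copied from the AAU rates table in this article.
RATES_PER_MILLION = {
    "Claude Opus 4.6": {"input": 100, "output": 500, "cache_read": 10, "cache_write": 125},
    "GPT 5.3 Codex": {"input": 35, "output": 280, "cache_read": 3.5, "cache_write": 0},
    "GPT 5.2": {"input": 35, "output": 280, "cache_read": 3.5, "cache_write": 0},
}


def active_flow_aaus(model: str, tokens: dict[str, int]) -> float:
    """Sum the AAUs across the four token types for one task (hypothetical helper)."""
    rates = RATES_PER_MILLION[model]
    return sum(tokens.get(kind, 0) / 1_000_000 * rate for kind, rate in rates.items())


# Token counts for the "Quick question" scenario.
quick_question = {"input": 20_000, "output": 2_000, "cache_read": 15_000, "cache_write": 5_000}

print(round(active_flow_aaus("Claude Opus 4.6", quick_question), 3))  # 3.775
print(round(active_flow_aaus("GPT 5.3 Codex", quick_question), 4))    # 1.3125
```

Substituting the token counts from the other scenario rows gives the remaining values in the scenario table, for example about 35.3 AAUs for an incident investigation on Claude Opus 4.6.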
Tip
To keep active flow costs predictable, set a monthly AAU allocation limit in Settings > Agent consumption.
Monitor your costs
In the SRE Agent portal
Go to Settings > Agent consumption to view your usage:
- Monthly AAU limit: your combined always-on and active flow allocation with a button to adjust it
- Total active flow consumption: donut chart breaking down usage by thread type (Chats, Incidents, Scheduled tasks, Triggers)
- Daily active flow consumption: stacked bar chart showing AAU usage per day, color-coded by type
- Consumption by thread: table listing every thread with its AAU cost, type, and status
For a full walkthrough, see Monitor agent usage.
Set an active flow spending limit
Select Change AAU allocation to set a monthly active flow AAU limit (minimum 500, maximum 1,000,000 AAUs). This limit applies to active flow only - always-on billing continues as long as the agent exists.
- When your agent reaches the active flow limit, it becomes unavailable for chat and actions until the next month. Always-on charges continue for the rest of the month.
- You can increase or decrease the allocation at any time.
- Increases take effect immediately - if you raise the limit above current consumption, chat and actions resume right away.
- Decreases take effect next month. Until then, the agent runs in always-on flow only.
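The limit rules above can be modeled with a small sketch. It isn't how the service is implemented. The function names are hypothetical, and the 35.25 AAUs per run is the incident investigation estimate for Claude Opus 4.6 from the scenario table.

```python
MIN_ALLOCATION_AAUS = 500        # minimum monthly active flow limit
MAX_ALLOCATION_AAUS = 1_000_000  # maximum monthly active flow limit


def validate_allocation(limit_aaus: float) -> float:
    """Return the limit if it falls inside the allowed range, else raise."""
    if not MIN_ALLOCATION_AAUS <= limit_aaus <= MAX_ALLOCATION_AAUS:
        raise ValueError(
            f"limit must be between {MIN_ALLOCATION_AAUS} and {MAX_ALLOCATION_AAUS} AAUs"
        )
    return limit_aaus


def active_flow_available(consumed_aaus: float, limit_aaus: float) -> bool:
    """True while this month's active flow consumption is below the limit."""
    return consumed_aaus < validate_allocation(limit_aaus)


def runs_within_limit(limit_aaus: float, aaus_per_run: float) -> int:
    """Whole number of runs of a given cost that fit inside the limit."""
    return int(validate_allocation(limit_aaus) // aaus_per_run)


print(active_flow_available(consumed_aaus=500, limit_aaus=500))   # False
print(active_flow_available(consumed_aaus=500, limit_aaus=1000))  # True
print(runs_within_limit(500, 35.25))  # 14
```

The second call mirrors the behavior described for increases: raising the limit above current consumption makes chat and actions available again. The last call suggests that the minimum allocation covers roughly 14 incident investigations at the assumed cost per run.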
Billing impact by action
| Action | Active flow | Always-on | How to resume |
|---|---|---|---|
| Reach the active flow limit | Stops | Still billed | Resets automatically at the start of the next month |
| Stop agent | Stops | Still billed | Manually select Start in Settings > Basics |
| Delete agent | Stops | Stops | Create a new agent |
In Azure Cost Management
For detailed billing breakdowns across multiple agents and resources, use Azure Cost Management in the Azure portal.
Cost optimization tips
| Strategy | Impact | How to do it |
|---|---|---|
| Add context to your agent | Fewer wasted tokens | Add skills, knowledge, and documents so the agent stays grounded and concise. Persistent memory from past interactions improves efficiency over time. |
| Filter incidents with response plans | Less unnecessary work | Use response plans to filter Azure Monitor alerts by severity, service, or keyword - the agent only investigates incidents that match. |
| Batch work with scheduled tasks | Fewer runs | Schedule tasks to run daily or weekly instead of polling continuously. See Scheduled tasks. |
| Test in chat before automating | Avoids wasted runs | Try your prompt in chat or the Playground first. A misconfigured automation runs repeatedly and wastes AAUs. |
| Stop idle agents | Eliminates active flow | Go to Settings > Basics and select Stop. The agent keeps its configuration but stops all active flow. Always-on charges continue until you delete the agent. |
| Delete unused agents | Eliminates all costs | In sre.azure.com, open the agent and go to Settings > Basics > Delete agent. All billing stops immediately. |
Frequently asked questions
How does the agent compute AAUs from tokens?
Every time your agent performs work, it tracks the LLM tokens consumed across all four token types and meters them at the AAU rates for your configured model. You can see your AAU consumption in Settings > Agent consumption.
Does the provider I choose affect my costs?
Yes. You set the model provider (Anthropic, OpenAI, and others) at the agent level, and the corresponding model determines which AAU rates apply. Different models have different rates. See the AAU rates table for current rates.
Which model should I choose?
Claude Opus 4.6 has higher AAU rates but typically produces more thorough investigations with fewer reasoning steps. For complex incident investigations and root cause analysis, Opus often reaches a conclusion in fewer tool calls, which can offset the higher per-token rate. GPT models are a good choice for simpler, high-volume tasks like scheduled compliance checks where cost efficiency matters more than depth. You can change your model provider at any time in Settings > Basics and compare results.
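To get a feel for when fewer reasoning steps offset the higher rate, the following sketch computes a break-even point from the approximate incident investigation figures in the scenario table. It assumes both models use the same mix of token types and that the cost scales linearly with token volume. Both are simplifications, and the function name is hypothetical.

```python
# Approximate AAUs per incident investigation, taken from the scenario table
# in this article (the table assumes the same token counts for both models).
OPUS_AAUS_PER_RUN = 35.3
GPT_CODEX_AAUS_PER_RUN = 11.7


def break_even_token_fraction(expensive_aaus: float, cheaper_aaus: float) -> float:
    """Fraction of the original token volume at which the costlier model costs the same.

    Assumes the mix of token types stays constant as volume shrinks.
    """
    return cheaper_aaus / expensive_aaus


fraction = break_even_token_fraction(OPUS_AAUS_PER_RUN, GPT_CODEX_AAUS_PER_RUN)
print(f"{fraction:.2f}")  # 0.33
```

Under these assumptions, Claude Opus 4.6 would need to finish the same investigation with about a third of the tokens to cost the same as GPT 5.3 Codex. Actual token use varies by task, so compare real runs in Settings > Agent consumption before you decide.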
Do I get charged when the agent is waiting for me to respond?
No. Only the time the agent spends actively processing a task counts as active flow. If the agent asks for your approval and waits, that waiting time isn't billed.
What counts as active flow?
Any time the agent is actively doing work counts as active flow. This work includes:
- Interactive prompts: a user asking the agent a question in chat
- Automation: scheduled tasks, incident response plans, or other automated triggers
- Async operations: background investigations, report generation, or remediation tasks
In all cases, the agent meters tokens consumed as AAUs.
What happens if I stop my agent?
A stopped agent can't monitor your resources or respond to prompts, but it still incurs the fixed always-on cost. The agent doesn't consume active flow AAUs while it's stopped. To stop your agent, go to Settings > Basics and select Stop. To resume, select Start from the same page. To stop all billing entirely, delete the agent.
Can one agent handle multiple workloads?
Yes. A single agent can monitor multiple resources within its configured scope. Consolidating workloads under one agent reduces always-on costs compared to deploying separate agents.
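As a hedged estimate of the saving, the sketch below compares always-on AAUs before and after consolidation, using the 4 AAUs per agent-hour rate from this article. The function name, the agent counts, and the 30-day month are assumptions for illustration. Active flow isn't included because it depends on the work performed, not on the number of agents.

```python
ALWAYS_ON_AAUS_PER_AGENT_HOUR = 4  # always-on rate stated in this article


def always_on_savings(agents_before: int, agents_after: int, hours: float) -> float:
    """Always-on AAUs saved by reducing the agent count (hypothetical helper)."""
    return ALWAYS_ON_AAUS_PER_AGENT_HOUR * hours * (agents_before - agents_after)


# Assumed example: consolidating three agents into one over a 30-day month.
print(always_on_savings(3, 1, 30 * 24))  # 5760
```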
Is there a free tier?
No. Azure SRE Agent charges begin at agent creation. See the Azure pricing calculator for current rates.
Is pricing the same in all regions?
Check the Azure pricing calculator for current pricing in your region.