Pricing and billing for Azure SRE Agent

Learn how Azure SRE Agent billing works and what to expect on your Azure bill.

How billing works

Azure SRE Agent charges are based on Azure Agent Units (AAUs), a standardized measure of agentic processing used across all prebuilt Azure agents. Your monthly bill combines two types of charges.

Always-on flow (fixed cost)

When you create an agent, you pay a fixed rate as long as the agent exists:

| Component | Rate |
| --- | --- |
| Always-on flow | 4 AAUs per agent-hour |

Always-on flow doesn't mean the agent is actively processing work. It represents the baseline cost of keeping your agent provisioned and available. Always-on billing continues from agent creation until the agent is deleted.
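To get a feel for the fixed cost, the following minimal sketch multiplies the always-on rate by the hours an agent exists. It assumes the charge scales linearly with provisioned hours; the function name and the `agent_count` parameter are illustrative and aren't part of any Azure API.

```python
ALWAYS_ON_AAUS_PER_AGENT_HOUR = 4  # rate from the always-on flow table


def always_on_aaus(hours_provisioned: float, agent_count: int = 1) -> float:
    """AAUs billed for keeping agents provisioned, regardless of activity.

    Assumption: the charge is linear in hours; the source doesn't state
    the billing granularity.
    """
    if hours_provisioned < 0 or agent_count < 0:
        raise ValueError("hours_provisioned and agent_count must be non-negative")
    return ALWAYS_ON_AAUS_PER_AGENT_HOUR * hours_provisioned * agent_count


# One agent that exists for a full 30-day month (720 hours).
print(always_on_aaus(30 * 24))      # 2880
# Three agents over the same period.
print(always_on_aaus(30 * 24, 3))   # 8640
```

Under these assumptions, each extra agent adds the same fixed amount whether or not it does any work.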

Active flow (variable cost)

Whenever your agent works - whether a user asks a question interactively, an automation triggers a task, or an async operation runs in the background - the agent consumes active flow AAUs. Any time the agent is actively processing counts as active flow, regardless of how the work was initiated.

How tokens become AAUs

Every time your agent works, it consumes LLM tokens. Each of the four token types described in the following table is metered separately, at the per-model rates listed under AAU rates by model.

| Token type | What it measures |
| --- | --- |
| Input | Tokens sent to the model (prompts, tool results, context) |
| Output | Tokens generated by the model (responses, reasoning) |
| Cache read | Tokens served from prompt cache (repeated context) |
| Cache write | Tokens written to prompt cache for future reuse |

Your total active flow AAUs for a task = sum of AAUs across all four token types.

AAU rates by model

Number of AAUs consumed per 1 million tokens:

| Model | Input | Output | Cache read | Cache write |
| --- | --- | --- | --- | --- |
| Claude Opus 4.6 | 100 AAUs | 500 AAUs | 10 AAUs | 125 AAUs |
| GPT 5.3 Codex | 35 AAUs | 280 AAUs | 3.5 AAUs | 0 AAUs |
| GPT 5.2 | 35 AAUs | 280 AAUs | 3.5 AAUs | 0 AAUs |

Rates are per 1 million tokens.
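The conversion from tokens to AAUs can be sketched as follows. The rates are copied from the table; the function name, the dictionary keys, and the token counts in the usage example are hypothetical and chosen only for illustration.

```python
# AAUs per 1 million tokens, copied from the rates table.
AAU_RATES_PER_MILLION = {
    "Claude Opus 4.6": {"input": 100, "output": 500, "cache_read": 10, "cache_write": 125},
    "GPT 5.3 Codex": {"input": 35, "output": 280, "cache_read": 3.5, "cache_write": 0},
    "GPT 5.2": {"input": 35, "output": 280, "cache_read": 3.5, "cache_write": 0},
}


def active_flow_aaus(model: str, tokens: dict[str, int]) -> float:
    """Sum AAUs across token types: tokens / 1,000,000 * rate for that type."""
    rates = AAU_RATES_PER_MILLION[model]
    return sum(count / 1_000_000 * rates[token_type] for token_type, count in tokens.items())


# Hypothetical task: 100K input, 10K output, 50K cache read, 20K cache write.
usage = {"input": 100_000, "output": 10_000, "cache_read": 50_000, "cache_write": 20_000}
print(round(active_flow_aaus("GPT 5.2", usage), 3))  # 6.475
```

For this hypothetical task on GPT 5.2, the result breaks down as 3.5 (input) + 2.8 (output) + 0.175 (cache read) + 0 (cache write).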

Note

Azure might add more models and providers in the future. Azure sets AAU rates and might update them as new models are released.

Key details:

  • Only processing time counts. Time the agent spends waiting for your response isn't billed as active flow.
  • Active flow resets monthly. Your AAU consumption counter resets at the beginning of each calendar month.
  • Set provider at agent level. Configure the model provider (Anthropic, OpenAI, and others) in your agent's settings. The corresponding model determines your AAU rates.

Active flow by task type

The number of tokens you use - and the AAUs you pay for - depends on how complex the task is. More complex tasks need more LLM reasoning steps, tool calls, and data processing, so they use more tokens.

Here's how token use translates to AAUs for common scenarios:

| Scenario | Input tokens | Output tokens | Cache read | Cache write | Claude Opus 4.6 AAUs | GPT 5.3 Codex AAUs | Example |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Quick question | ~20K | ~2K | ~15K | ~5K | ~3.8 | ~1.3 | "Show me recent alerts" |
| Incident investigation | ~200K | ~15K | ~150K | ~50K | ~35.3 | ~11.7 | Automated incident from Azure Monitor |
| Full remediation | ~500K | ~40K | ~400K | ~100K | ~86.5 | ~30.1 | "Diagnose and fix the failing deployment" |

How the math works (Claude Opus 4.6 example - quick question):

| Token type | Tokens | Rate per 1M | AAUs |
| --- | --- | --- | --- |
| Input | 20K | 100 | 2.0 |
| Output | 2K | 500 | 1.0 |
| Cache read | 15K | 10 | 0.15 |
| Cache write | 5K | 125 | 0.625 |
| Total | | | 3.775 AAUs |
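The same arithmetic can be applied to every scenario and both models to cross-check the scenario table. This is a small sketch that uses the approximate token counts and the published rates; the names `RATES`, `SCENARIOS`, and `scenario_aaus` are illustrative.

```python
# (input, output, cache read, cache write) AAUs per 1M tokens, from the rates table.
RATES = {
    "Claude Opus 4.6": (100, 500, 10, 125),
    "GPT 5.3 Codex": (35, 280, 3.5, 0),
}

# Approximate token counts per scenario, in the same order as the rates.
SCENARIOS = {
    "Quick question": (20_000, 2_000, 15_000, 5_000),
    "Incident investigation": (200_000, 15_000, 150_000, 50_000),
    "Full remediation": (500_000, 40_000, 400_000, 100_000),
}


def scenario_aaus(tokens, rates):
    """AAUs for one scenario: each token count times its per-million rate."""
    return sum(t * r for t, r in zip(tokens, rates)) / 1_000_000


for name, tokens in SCENARIOS.items():
    for model, rates in RATES.items():
        print(f"{name:<24} {model:<16} {scenario_aaus(tokens, rates):.4f} AAUs")
```

The unrounded results (for example, 3.775 and 1.3125 AAUs for a quick question) agree with the approximate values in the scenario table.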

Tip

To keep active flow costs predictable, set a monthly AAU allocation limit in Settings > Agent consumption.

Monitor your costs

In the SRE Agent portal

Go to Settings > Agent consumption to view your usage:

  • Monthly AAU limit: your combined always-on and active flow allocation with a button to adjust it
  • Total active flow consumption: donut chart breaking down usage by thread type (Chats, Incidents, Scheduled tasks, Triggers)
  • Daily active flow consumption: stacked bar chart showing AAU usage per day, color-coded by type
  • Consumption by thread: table listing every thread with its AAU cost, type, and status

For a full walkthrough, see Monitor agent usage.

Set an active flow spending limit

Select Change AAU allocation to set a monthly active flow AAU limit (minimum 500, maximum 1,000,000 AAUs). This limit applies to active flow only - always-on billing continues as long as the agent exists.

  • When your agent reaches the active flow limit, it becomes unavailable for chat and actions until the next month. Always-on charges continue for the rest of the month.
  • You can increase or decrease the allocation at any time.
  • Increases take effect immediately - if you raise the limit above current consumption, chat and actions resume right away.
  • Decreases take effect next month. Until then, the agent runs in always-on flow only.
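The limit rules above can be modeled roughly as follows. This is only a planning sketch: the helper names are hypothetical and aren't part of any Azure SDK, and how the service enforces the limit in the middle of a running task isn't covered here. The per-task figure in the usage example is the incident investigation estimate for Claude Opus 4.6 from the scenario table.

```python
MIN_LIMIT_AAUS = 500          # minimum monthly active flow limit
MAX_LIMIT_AAUS = 1_000_000    # maximum monthly active flow limit


def validate_limit(limit_aaus: float) -> float:
    """Reject limits outside the allowed 500 to 1,000,000 AAU range."""
    if not MIN_LIMIT_AAUS <= limit_aaus <= MAX_LIMIT_AAUS:
        raise ValueError(f"limit must be between {MIN_LIMIT_AAUS} and {MAX_LIMIT_AAUS} AAUs")
    return limit_aaus


def is_available(consumed_aaus: float, limit_aaus: float) -> bool:
    """Chat and actions stay available while consumption is below the limit."""
    return consumed_aaus < validate_limit(limit_aaus)


def tasks_within_limit(limit_aaus: float, aaus_per_task: float) -> int:
    """Rough count of whole tasks that fit into a monthly limit."""
    if aaus_per_task <= 0:
        raise ValueError("aaus_per_task must be positive")
    return int(validate_limit(limit_aaus) // aaus_per_task)


print(is_available(consumed_aaus=480, limit_aaus=500))  # True
print(is_available(consumed_aaus=500, limit_aaus=500))  # False
print(tasks_within_limit(500, 35.25))                   # 14
```

With the minimum limit of 500 AAUs, roughly 14 incident investigations of that size would fit into a month before the agent becomes unavailable.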

Billing impact by action

| Action | Active flow | Always-on | To resume next month |
| --- | --- | --- | --- |
| Set budget limit (hit limit) | Stops | Still billed | Resets automatically at start of month |
| Stop agent | Stops | Still billed | Manually select Start in Settings > Basics |
| Delete agent | Stops | Stops | Create a new agent |

In Azure Cost Management

For detailed billing breakdowns across multiple agents and resources, use Azure Cost Management in the Azure portal.

Cost optimization tips

| Strategy | Impact | How to do it |
| --- | --- | --- |
| Add context to your agent | Fewer wasted tokens | Add skills, knowledge, and documents so the agent stays grounded and concise. Persistent memory from past interactions improves efficiency over time. |
| Filter incidents with response plans | Less unnecessary work | Use response plans to filter Azure Monitor alerts by severity, service, or keyword - the agent only investigates incidents that match. |
| Batch work with scheduled tasks | Fewer runs | Schedule tasks to run daily or weekly instead of polling continuously. See Scheduled tasks. |
| Test in chat before automating | Avoids wasted runs | Try your prompt in chat or the Playground first. A misconfigured automation runs repeatedly and wastes AAUs. |
| Stop idle agents | Eliminates active flow | Go to Settings > Basics and select Stop. The agent keeps its configuration but stops all active flow. Always-on cost continues until deleted. |
| Delete unused agents | Eliminates all costs | In sre.azure.com, open the agent and go to Settings > Basics > Delete agent. All billing stops immediately. |

Frequently asked questions

How does the agent compute AAUs from tokens?

Every time your agent performs work, it tracks the LLM tokens consumed across all four token types and meters them at the AAU rates for your configured model. You can see your AAU consumption in Settings > Agent consumption.

Does the provider I choose affect my costs?

Yes. The model provider you set at the agent level (Anthropic, OpenAI, and others) determines which AAU rates apply, and different models have different rates. See the AAU rates table for current rates.

Which model should I choose?

Claude Opus 4.6 has higher AAU rates but typically produces more thorough investigations with fewer reasoning steps. For complex incident investigations and root cause analysis, Opus often reaches a conclusion in fewer tool calls, which can offset the higher per-token rate. GPT models are a good choice for simpler, high-volume tasks like scheduled compliance checks where cost efficiency matters more than depth. You can change your model provider at any time in Settings > Basics and compare results.
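One way to reason about this trade-off is to ask how much smaller a task's token volume must be on the higher-rate model for the cost to come out even. The sketch below assumes the token mix stays proportionally the same on both models, which is a simplification, and uses the incident investigation token counts from the scenario table. The function names are illustrative.

```python
# (input, output, cache read, cache write) AAUs per 1M tokens, from the rates table.
RATES = {
    "Claude Opus 4.6": (100, 500, 10, 125),
    "GPT 5.3 Codex": (35, 280, 3.5, 0),
}


def task_aaus(tokens, model):
    """AAUs for a task given token counts ordered like the rate tuples."""
    return sum(t * r for t, r in zip(tokens, RATES[model])) / 1_000_000


def break_even_fraction(tokens, pricier, cheaper):
    """Fraction of the token volume at which the pricier model costs the same.

    Assumption: both models use the same proportional mix of token types.
    """
    return task_aaus(tokens, cheaper) / task_aaus(tokens, pricier)


incident = (200_000, 15_000, 150_000, 50_000)
fraction = break_even_fraction(incident, "Claude Opus 4.6", "GPT 5.3 Codex")
print(f"Break-even token fraction: {fraction:.2f}")  # 0.33
```

Under these assumptions, Claude Opus 4.6 would need to finish the investigation with about a third of the tokens to match the GPT 5.3 Codex cost. Actual token use varies by task, so compare real consumption in Settings > Agent consumption before deciding.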

Do I get charged when the agent is waiting for me to respond?

No. Only the time the agent spends actively processing a task counts as active flow. If the agent asks for your approval and waits, that waiting time isn't billed.

What counts as active flow?

Any time the agent is actively doing work counts as active flow. This work includes:

  • Interactive prompts: a user asking the agent a question in chat
  • Automation: scheduled tasks, incident response plans, or other automated triggers
  • Async operations: background investigations, report generation, or remediation tasks

In all cases, the agent meters tokens consumed as AAUs.

What happens if I stop my agent?

A stopped agent can't monitor your resources or respond to prompts, but it still incurs the fixed always-on cost. No active flow AAUs are consumed while the agent is stopped. To stop your agent, go to Settings > Basics and select Stop. To resume, select Start from the same page. To stop all billing entirely, delete the agent.

Can one agent handle multiple workloads?

Yes. A single agent can monitor multiple resources within its configured scope. Consolidating workloads under one agent reduces always-on costs compared to deploying separate agents.

Is there a free tier?

No. Azure SRE Agent charges begin at agent creation. See the Azure pricing calculator for current rates.

Is pricing the same in all regions?

Check the Azure pricing calculator for current pricing in your region.