Cosmos total request units compared to requests

Trevor Nichols 0 Reputation points
2026-05-06T03:12:40.58+00:00

We have a Cosmos DB account with extremely high total request units (RUs) that do not match the number of requests. The workload drops to virtually zero overnight, yet RU consumption has stayed consistently high.

User's image

How can we identify what is causing the total request units to be so high? I have determined that the consumption comes from Query operations, so I filtered to just those. We have a lot of queries, sure, but this is a significant mismatch.

A longer-term view shows this isn't always happening: just before the Tue 05 label, the two metrics correlate as I would expect.

User's image

Azure Cosmos DB

An Azure NoSQL database service for app development.


1 answer

  1. Marcin Policht 89,325 Reputation points MVP Volunteer Moderator
    2026-05-06T03:24:03.49+00:00

    This usually points to inefficiencies in the requests, indexing issues, or hidden background operations. Since you’ve already filtered to query operations, the next step would be to dig into the requests and how they’re consuming RUs.

    One common cause is queries that scan large amounts of data. Cosmos DB charges RUs based on the work done, not just the number of requests, so a single inefficient query can consume hundreds or thousands of RUs. You should check for queries that lack selective filters, omit the partition key, or fan out across partitions unnecessarily.
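A rough way to quantify the mismatch is to divide total RUs by request count for each time window and see where the average cost per request jumps. The following is only a sketch: the `Bucket` type, the sample figures, and the threshold of 100 RUs per request are all hypothetical, and in practice the numbers would come from whatever metrics export you already have.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """One time window of exported metrics (hypothetical shape)."""
    label: str
    total_request_units: float
    total_requests: int


def ru_per_request(bucket: Bucket) -> float:
    """Average RU cost of a single request in this window."""
    if bucket.total_requests == 0:
        return 0.0
    return bucket.total_request_units / bucket.total_requests


def flag_expensive_buckets(buckets: list[Bucket], threshold: float) -> list[str]:
    """Return labels of windows whose average RU per request exceeds threshold."""
    return [b.label for b in buckets if ru_per_request(b) > threshold]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the account in question.
    sample = [
        Bucket("Mon day", 500_000, 100_000),
        Bucket("Mon night", 1_200_000, 400),
    ]
    for b in sample:
        print(b.label, ru_per_request(b))
    print(flag_expensive_buckets(sample, threshold=100.0))
```

A window with few requests but a very high average cost points at a small number of expensive queries rather than at overall traffic.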

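    To investigate further, look at the query metrics per operation. Over the REST API, set the request header x-ms-documentdb-populatequerymetrics to true, and the response returns the metrics in the x-ms-documentdb-query-metrics header; the SDKs expose the same data through their query metrics or diagnostics options. The metrics include the retrieved document count, the output document count, and the index utilization, and the RU charge is returned with each response. If retrievedDocumentCount is much higher than outputDocumentCount, the query is loading many documents that it then discards, which usually means a filter is not being served by the index.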
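If you capture the raw x-ms-documentdb-query-metrics value, it can be split into numbers and reduced to a single ratio. This is a minimal sketch that assumes the value is a semicolon-separated list of `name=value` pairs; the sample string is illustrative rather than taken from a real response, and `parse_query_metrics` and `scan_ratio` are hypothetical helper names.

```python
def parse_query_metrics(header_value: str) -> dict[str, float]:
    """Parse 'name=value;name=value' into a dict of floats.

    Assumes the semicolon-delimited format; pairs that are not
    numeric are skipped rather than raising.
    """
    metrics: dict[str, float] = {}
    for pair in header_value.split(";"):
        key, sep, raw = pair.partition("=")
        if not sep:
            continue
        try:
            metrics[key.strip()] = float(raw)
        except ValueError:
            continue
    return metrics


def scan_ratio(metrics: dict[str, float]) -> float:
    """Documents loaded per document returned; higher means more wasted work."""
    retrieved = metrics.get("retrievedDocumentCount", 0.0)
    output = metrics.get("outputDocumentCount", 0.0)
    if output == 0:
        return float("inf") if retrieved > 0 else 0.0
    return retrieved / output


if __name__ == "__main__":
    # Illustrative header value, not from a real response.
    sample = (
        "totalExecutionTimeInMs=33.67;"
        "retrievedDocumentCount=60951;"
        "outputDocumentCount=7;"
        "indexUtilizationRatio=0.00"
    )
    parsed = parse_query_metrics(sample)
    print(parsed["retrievedDocumentCount"], parsed["outputDocumentCount"])
    print(scan_ratio(parsed))
```

A ratio close to 1 suggests the index is doing the filtering; a ratio in the thousands suggests the query is effectively scanning.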

    You can also use the Azure Monitor metrics for Cosmos DB to break down RU consumption by operation type, container, and time window. Look specifically at the RU charge per query and the consumption per partition. Sometimes a single partition gets hot, causing spikes that aren't apparent if you only look at the number of requests.
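Once you have RU consumption split by partition, spotting a hot one is simple arithmetic: compare each partition's share of the total with an even split. The sketch below assumes you have already exported RU totals keyed by a partition identifier; the function names, the sample values, and the factor of 2 are hypothetical choices for illustration.

```python
def partition_shares(ru_by_partition: dict[str, float]) -> dict[str, float]:
    """Fraction of total RUs consumed by each partition."""
    total = sum(ru_by_partition.values())
    if total == 0:
        return {pid: 0.0 for pid in ru_by_partition}
    return {pid: ru / total for pid, ru in ru_by_partition.items()}


def hot_partitions(ru_by_partition: dict[str, float], factor: float = 2.0) -> list[str]:
    """Partitions consuming more than `factor` times an even share of RUs."""
    shares = partition_shares(ru_by_partition)
    if not shares:
        return []
    even_share = 1.0 / len(shares)
    return sorted(pid for pid, share in shares.items() if share > factor * even_share)


if __name__ == "__main__":
    # Illustrative totals for four partitions over one time window.
    sample = {"0": 100.0, "1": 120.0, "2": 900.0, "3": 80.0}
    print(partition_shares(sample))
    print(hot_partitions(sample))
```

The factor is a judgment call: a lower value flags more partitions, so it is worth tuning against a period you know was healthy.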

    Finally, check whether any automated processes run around those spikes, such as batch jobs, change feed processors, or TTL cleanup, since these may be consuming RUs inefficiently. They can run even when your normal workload is low and inflate RU usage unexpectedly.


    hth

    Marcin
