
Azure AI Foundry moderation_blocked for safe child images

Andres Hurtado 0 Reputation points
2026-05-07T21:46:32.7733333+00:00

We are experiencing inconsistent moderation behavior when using gpt-image-2 in Azure AI Foundry. Our application generates stylized watercolor children's book characters from uploaded reference images.

Some requests fail with:

"code": "moderation_blocked"

However, retrying the exact same request with the same image and prompt sometimes succeeds. This behavior appears non-deterministic and is affecting production reliability. We would appreciate clarification on:

- whether this is expected behavior
- best practices to reduce false positives
- whether any configuration options exist for this scenario

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


2 answers

  1. kagiyama yutaka 2,590 Reputation points
    2026-05-12T14:51:17.12+00:00

    I think moderation can shift on edge cases, so keeping prompts clearly fictional and unambiguous is the only stable approach. There's no user-side way to relax the filters, and the public docs note that they may still flag context-dependent edge cases.
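
    To illustrate that advice, here is a hypothetical before/after of the kind of rewording meant here. Neither string comes from Microsoft guidance, and the exact wording that passes will depend on your content:

    ```python
    # Hypothetical prompt wording only, for illustration.
    ambiguous = "a young girl based on the uploaded photo, playing in a garden"
    explicit = (
        "a fictional, non-photorealistic children's book character in a stylized "
        "watercolor illustration, cartoon proportions, playing in a garden"
    )
    ```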


  2. Q&A Assist
    2026-05-07T21:47:00.8333333+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    This behavior is expected in the sense that Azure AI Foundry image generation and moderation pipelines include multiple safety and content filtering components, and these can sometimes lead to non-deterministic blocking outcomes for borderline content.

    Key points from the documented behavior:

    1. Built-in Responsible AI and moderation
    • Azure OpenAI image generation models (including gpt-image-1 and successors) have built-in Responsible AI protections and both input and output moderation.
    • Azure-specific safeguards such as content filtering and abuse monitoring run on every request and can block prompts or generated images that are assessed as harmful or policy-violating.
    • The system can block either:
      • The prompt (input) before generation, or
      • The generated image (output) after generation.
    • When content filtering triggers, the operation status is set to Failed and an error code such as contentFilter (or, in your case, moderation_blocked) is returned.
    2. Why behavior can appear non-deterministic
    • Content filters and safety systems operate on probabilistic models and multiple signals. For borderline cases (for example, stylized or child-like characters), small internal variations can cause some requests to be blocked while others pass.
    • The documentation explicitly notes that prompts or generated outputs can be blocked by the content filter, and that this is part of the safety system rather than a strict deterministic rule set.
    3. Best practices to reduce false positives
    Based on the documented guidance around content filtering and troubleshooting image generation (see the sketch after this list for one way to apply this in code):
    • Refine prompts to be clearly compliant:
      • Avoid ambiguous wording that could be misinterpreted as involving real minors in sensitive contexts.
      • Emphasize non-photorealistic, clearly stylized art (for example, “stylized watercolor illustration”, “cartoon-style character”, “non-photorealistic children’s book illustration”).
      • Use negative prompts to explicitly exclude unwanted elements (for example, “no realistic photography”, “no nudity”, “no violence”).
    • Check content filtering logs and guardrails:
      • When prompts don’t produce an image and content filtering is suspected, review content filtering logs as recommended in the troubleshooting guidance.
      • Use the Guardrails and controls guidance to align prompts with acceptable content categories and avoid patterns that are more likely to be blocked.
    • Increase prompt specificity:
      • The troubleshooting guidance notes that low-quality or unexpected images often result from vague prompts; more specific prompts also help the moderation system better understand benign intent.
    4. Configuration and customization options
    • Content filtering is enabled by default for image generation APIs.
    • The documentation notes that:
      • Customers can learn more about content filtering and how to customize it via the content filtering documentation.
      • There is a process to apply to opt out of content filtering in some scenarios.
    • For images of minors specifically:
      • Photorealistic images of minors are blocked by default.
      • Access to this capability requires a request process; enterprise-tier customers are automatically approved.
    • For your scenario (stylized children’s book characters), the main levers documented are:
      • Prompt design and clarity.
      • Reviewing and aligning with guardrails guidance.
      • Requesting customization or opt-out of content filtering if eligible and appropriate for the use case.
    5. Operational mitigations
    Within the documented constraints:
    • Implement robust error handling for moderation-related failures (such as contentFilter/moderation_blocked), including:
      • Detecting these error codes and surfacing user-friendly messages.
      • Optionally offering alternative prompts or styles that are less likely to be blocked.
    • Avoid blind automatic retries with the exact same prompt and image, since repeated failures indicate the safety system is consistently treating the content as borderline or non-compliant; the sketch below varies the prompt instead of retrying it verbatim.
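
    As a concrete (and deliberately hedged) sketch of points 3 and 5, the following Python uses the openai SDK against Azure OpenAI to detect a moderation block and retry once with more explicitly stylized wording instead of repeating the identical request. The deployment name gpt-image-2, the API version, the STYLE_PREFIX wording, and the exact set of error codes are assumptions to verify against your own failed responses:

    ```python
    import os

    from openai import AzureOpenAI, BadRequestError

    client = AzureOpenAI(
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2025-04-01-preview",  # assumed; use the version your deployment targets
    )

    # Hypothetical wording that states the benign, stylized intent up front.
    STYLE_PREFIX = (
        "A fictional, non-photorealistic children's book illustration in a "
        "stylized watercolor style, cartoon proportions, no realistic photography: "
    )

    # Codes reported for content-filter failures; verify against your own logs.
    MODERATION_CODES = {"moderation_blocked", "contentFilter", "content_filter"}


    def moderation_code(err: BadRequestError) -> str | None:
        """Extract the error code from the response body, if present."""
        body = err.body if isinstance(err.body, dict) else {}
        error = body.get("error") or {}
        return error.get("code") if isinstance(error, dict) else None


    def generate_character(prompt: str):
        try:
            return client.images.generate(model="gpt-image-2", prompt=prompt, n=1)
        except BadRequestError as err:
            if moderation_code(err) not in MODERATION_CODES:
                raise  # not a moderation failure; handle normally
            # One deliberate variation, not a blind identical retry.
            try:
                return client.images.generate(
                    model="gpt-image-2", prompt=STYLE_PREFIX + prompt, n=1
                )
            except BadRequestError:
                # Consistently blocked: surface a friendly message to the user.
                return None
    ```

    If the second attempt also fails, treating the request as consistently blocked and guiding the user toward different wording is usually more reliable than further retries.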

    In summary, the moderation system’s probabilistic and multi-signal nature can lead to occasional non-deterministic blocking for borderline content. The recommended mitigations are to refine prompts to be clearly stylized and compliant, use guardrails and content filtering guidance, review logs when prompts are blocked, and, where appropriate, pursue documented customization or opt-out options.
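
    If you do review the content filtering logs mentioned above, it also helps to record enough on your side to correlate blocked requests with them. A minimal sketch follows; the field names are arbitrary rather than a Microsoft schema, and the request ID is whatever identifier you capture from the failed response, if any:

    ```python
    import hashlib
    import json
    import logging

    logger = logging.getLogger("image_moderation")


    def log_block(prompt: str, code: str, request_id: str | None = None) -> None:
        # Hash the prompt rather than storing it raw, in case it contains user data.
        logger.warning(json.dumps({
            "event": "image_generation_blocked",
            "error_code": code,
            "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            "request_id": request_id,
        }))
    ```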


