
Azure AI Foundry moderation_blocked for safe child images

Andres Hurtado 0 Reputation points
2026-05-07T21:46:32.7733333+00:00

We are experiencing inconsistent moderation behavior when using gpt-image-2 in Azure AI Foundry. Our application generates stylized watercolor children's book characters from uploaded reference images.

Some requests fail with:

"code": "moderation_blocked"

However, retrying the exact same request with the same image and prompt sometimes succeeds. This behavior appears non-deterministic and is affecting production reliability. We would appreciate clarification on:

- whether this is expected behavior
- best practices to reduce false positives
- whether any configuration options exist for this scenario

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


2 answers

  1. kagiyama yutaka 2,590 Reputation points
    2026-05-12T14:51:17.12+00:00

    I think moderation can shift on edge cases, so keeping prompts clearly fictional and unambiguous is the only stable approach. There's no user-side way to relax the filters, and the public docs note that they may still flag context-dependent edge cases.
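
    To illustrate that advice, here is a hypothetical before/after of the kind of rewording meant here. Neither string comes from Microsoft guidance, and the exact wording that passes will depend on your content:

    ```python
    # Hypothetical prompt wording only, for illustration.
    ambiguous = "a young girl based on the uploaded photo, playing in a garden"
    explicit = (
        "a fictional, non-photorealistic children's book character in a stylized "
        "watercolor illustration, cartoon proportions, playing in a garden"
    )
    ```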


  2. Q&A Assist
    2026-05-07T21:47:00.8333333+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    This behavior is expected in the sense that Azure AI Foundry image generation and moderation pipelines include multiple safety and content filtering components, and these can sometimes lead to non-deterministic blocking outcomes for borderline content.

    Key points from the documented behavior:

    1. Built-in Responsible AI and moderation
    • Azure OpenAI image generation models (including gpt-image-1 and successors) have built-in Responsible AI protections and both input and output moderation.
    • Azure-specific safeguards such as content filtering and abuse monitoring run on every request and can block prompts or generated images that are assessed as harmful or policy-violating.
    • The system can block either:
      • The prompt (input) before generation, or
      • The generated image (output) after generation.
    • When content filtering triggers, the operation status is set to Failed and an error code such as contentFilter (or, in your case, moderation_blocked) is returned.
    2. Why behavior can appear non-deterministic
    • Content filters and safety systems operate on probabilistic models and multiple signals. For borderline cases (for example, stylized or child-like characters), small internal variations can cause some requests to be blocked while others pass.
    • The documentation explicitly notes that prompts or generated outputs can be blocked by the content filter, and that this is part of the safety system rather than a strict deterministic rule set.
    3. Best practices to reduce false positives
    Based on the documented guidance around content filtering and troubleshooting image generation (see the sketch after this list for one way to apply this in code):
    • Refine prompts to be clearly compliant:
      • Avoid ambiguous wording that could be misinterpreted as involving real minors in sensitive contexts.
      • Emphasize non-photorealistic, clearly stylized art (for example, “stylized watercolor illustration”, “cartoon-style character”, “non-photorealistic children’s book illustration”).
      • Use negative prompts to explicitly exclude unwanted elements (for example, “no realistic photography”, “no nudity”, “no violence”).
    • Check content filtering logs and guardrails:
      • When prompts don’t produce an image and content filtering is suspected, review content filtering logs as recommended in the troubleshooting guidance.
      • Use the Guardrails and controls guidance to align prompts with acceptable content categories and avoid patterns that are more likely to be blocked.
    • Increase prompt specificity:
      • The troubleshooting guidance notes that low-quality or unexpected images often result from vague prompts; more specific prompts also help the moderation system better understand benign intent.
    4. Configuration and customization options
    • Content filtering is enabled by default for image generation APIs.
    • The documentation notes that:
      • Customers can learn more about content filtering and how to customize it via the content filtering documentation.
      • There is a process to apply to opt out of content filtering in some scenarios.
    • For images of minors specifically:
      • Photorealistic images of minors are blocked by default.
      • Access to this capability requires a request process; enterprise-tier customers are automatically approved.
    • For your scenario (stylized children’s book characters), the main levers documented are:
      • Prompt design and clarity.
      • Reviewing and aligning with guardrails guidance.
      • Requesting customization or opt-out of content filtering if eligible and appropriate for the use case.
    5. Operational mitigations
    Within the documented constraints:
    • Implement robust error handling for moderation-related failures (such as contentFilter/moderation_blocked), including:
      • Detecting these error codes and surfacing user-friendly messages.
      • Optionally offering alternative prompts or styles that are less likely to be blocked.
    • Avoid blind automatic retries with the exact same prompt and image, since repeated failures indicate the safety system is consistently treating the content as borderline or non-compliant; the sketch below varies the prompt instead of retrying it verbatim.
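
    As a concrete (and deliberately hedged) sketch of points 3 and 5, the following Python uses the openai SDK against Azure OpenAI to detect a moderation block and retry once with more explicitly stylized wording instead of repeating the identical request. The deployment name gpt-image-2, the API version, the STYLE_PREFIX wording, and the exact set of error codes are assumptions to verify against your own failed responses:

    ```python
    import os

    from openai import AzureOpenAI, BadRequestError

    client = AzureOpenAI(
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2025-04-01-preview",  # assumed; use the version your deployment targets
    )

    # Hypothetical wording that states the benign, stylized intent up front.
    STYLE_PREFIX = (
        "A fictional, non-photorealistic children's book illustration in a "
        "stylized watercolor style, cartoon proportions, no realistic photography: "
    )

    # Codes reported for content-filter failures; verify against your own logs.
    MODERATION_CODES = {"moderation_blocked", "contentFilter", "content_filter"}


    def moderation_code(err: BadRequestError) -> str | None:
        """Extract the error code from the response body, if present."""
        body = err.body if isinstance(err.body, dict) else {}
        error = body.get("error") or {}
        return error.get("code") if isinstance(error, dict) else None


    def generate_character(prompt: str):
        try:
            return client.images.generate(model="gpt-image-2", prompt=prompt, n=1)
        except BadRequestError as err:
            if moderation_code(err) not in MODERATION_CODES:
                raise  # not a moderation failure; handle normally
            # One deliberate variation, not a blind identical retry.
            try:
                return client.images.generate(
                    model="gpt-image-2", prompt=STYLE_PREFIX + prompt, n=1
                )
            except BadRequestError:
                # Consistently blocked: surface a friendly message to the user.
                return None
    ```

    If the second attempt also fails, treating the request as consistently blocked and guiding the user toward different wording is usually more reliable than further retries.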

    In summary, the moderation system’s probabilistic and multi-signal nature can lead to occasional non-deterministic blocking for borderline content. The recommended mitigations are to refine prompts to be clearly stylized and compliant, use guardrails and content filtering guidance, review logs when prompts are blocked, and, where appropriate, pursue documented customization or opt-out options.
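
    If you do review the content filtering logs mentioned above, it also helps to record enough on your side to correlate blocked requests with them. A minimal sketch follows; the field names are arbitrary rather than a Microsoft schema, and the request ID is whatever identifier you capture from the failed response, if any:

    ```python
    import hashlib
    import json
    import logging

    logger = logging.getLogger("image_moderation")


    def log_block(prompt: str, code: str, request_id: str | None = None) -> None:
        # Hash the prompt rather than storing it raw, in case it contains user data.
        logger.warning(json.dumps({
            "event": "image_generation_blocked",
            "error_code": code,
            "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            "request_id": request_id,
        }))
    ```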


