Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
The GenAI (Generative AI) Prompt skill executes a chat completion request against a large language model (LLM) deployed in Azure OpenAI in Foundry Models or Microsoft Foundry. Use this skill to create new information that can be indexed and stored as searchable content.
Here are some examples of how the GenAI prompt skill can help you create content:
- Verbalize images
- Summarize large passages of text
- Simplify complex content
- Perform any other task that you can articulate in a prompt
The GenAI Prompt skill is generally available in the 2026-04-01 Search Service REST API and in Azure SDKs that target this version. This skill supports text, image, and multimodal content, such as images with visuals and text extracted from PDF files.
Tip
It's common to combine this skill with a data chunking skill. The Multimodal tutorial demonstrates image verbalization with two different data chunking strategies.
Supported models
You can use any chat completion inference model deployed in Foundry, such as GPT models, Deepseek R#, Llama-4-Mavericj, and Cohere-command-r. For GPT models specifically, only the chat completions API endpoints are supported. Endpoints using the Azure OpenAI Responses API (containing
/openai/responsesin the URI) aren't currently compatible.For image verbalization, the model you use to analyze the image determines what image formats are supported.
For GPT-5 models, the
temperatureparameter is not supported in the same way as previous models. If defined, it must be set to1.0, as other values will result in errors.Billing is based on the pricing of the model you use.
Note
The search service connects to your model over a public endpoint, so there are no region location requirements. However, if you're using an all-up Azure solution, you should check the Azure AI Search regions and the Azure OpenAI model regions to find suitable pairs, especially if you have data residency requirements.
Prerequisites
An Azure OpenAI in Foundry Models resource or Foundry project.
A supported model deployed to your resource or project.
For Azure OpenAI, copy the endpoint with the
openai.azure.comdomain from the Keys and Endpoint page in the Azure portal. Use this endpoint for theUriparameter in this skill.For Foundry, copy the target URI for the deployment from the Models page in the Foundry portal. Use this endpoint for the
Uriparameter in this skill.
Authentication can be key-based with an API key from your Foundry or Azure OpenAI resource. However, we recommend role-based access using a search service managed identity assigned to a role.
On Azure OpenAI, assign Cognitive Services OpenAI User to the managed identity.
On Foundry, assign Foundry User to the managed identity.
Important
The Foundry RBAC roles were recently renamed. Foundry User, Foundry Owner, Foundry Account Owner, and Foundry Project Manager were previously named Azure AI User, Azure AI Owner, Azure AI Account Owner, and Azure AI Project Manager. You might still see the previous names in some places while the rename rolls out. The role IDs and core permissions are unchanged by the rename.
@odata.type
#Microsoft.Skills.Custom.ChatCompletionSkill
Data limits
| Limit | Notes |
|---|---|
maxTokens |
Default is 1024 if omitted. Maximum value is model-dependent. |
| Request time-out | Fixed at 30 seconds. Consider this limit when you choose a model for bulk indexing, as reasoning models (such as o1 and o3) might exceed it. |
| Images | Base 64–encoded images and image URLs are supported. Size limit is model-dependent. |
Skill parameters
| Property | Type | Required | Notes |
|---|---|---|---|
uri |
string | Yes | Public endpoint of the deployed model. Supported domains are:
|
apiKey |
string | Cond.* | Secret key for the model. Leave blank when using managed identity. |
authIdentity |
string | Cond.* | User-assigned managed identity client ID (Azure OpenAI only). Leave blank to use the system-assigned identity. |
commonModelParameters |
object | No | Standard generation controls such as temperature, maxTokens, etc. |
extraParameters |
object | No | Open dictionary passed through to the underlying model API. |
extraParametersBehavior |
string | No | "pass-through" | "drop" | "error" (default "error"). |
responseFormat |
object | No | Controls whether the model returns text, a free-form JSON object, or a strongly typed JSON schema. responseFormat payload examples: {responseFormat: { type: text }}, {responseFormat: { type: json_object }}, {responseFormat: { type: json_schema }} |
* Exactly one of apiKey, authIdentity, or the service’s system-assigned identity must be used.
commonModelParameters defaults
| Parameter | Default |
|---|---|
model |
(deployment default) |
frequencyPenalty |
0 |
presencePenalty |
0 |
maxTokens |
1024 |
temperature |
0.7 |
seed |
null |
stop |
null |
Skill inputs
| Input name | Type | Required | Description |
|---|---|---|---|
systemMessage |
string | Yes | System-level instruction (ex: "You are a helpful assistant."). |
userMessage |
string | Yes | User prompt. |
text |
string | No | Optional text appended to userMessage (text-only scenarios). |
image |
string (Base 64 data-URL) | No | Adds an image to the prompt (multimodal models only). |
imageDetail |
string (low | high | auto) |
No | Fidelity hint for Azure OpenAI multimodal models. |
Skill outputs
| Output name | Type | Description |
|---|---|---|
response |
string or JSON object | Model output in the format requested by responseFormat.type. |
usageInformation |
JSON object | Token counts and echo of model parameters. |
Sample definitions
Text-only summarization
{
"@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
"name": "Summarizer",
"description": "Summarizes document content.",
"context": "/document",
"inputs": [
{ "name": "text", "source": "/document/content" },
{ "name": "systemMessage", "source": "='You are a concise AI assistant.'" },
{ "name": "userMessage", "source": "='Summarize the following text:'" }
],
"outputs": [ { "name": "response" } ],
"uri": "https://demo.openai.azure.com/openai/deployments/gpt-4o/chat/completions",
"apiKey": "<api-key>",
"commonModelParameters": { "temperature": 0.3 }
}
Text + image description
{
"@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
"name": "Image Describer",
"context": "/document/normalized_images/*",
"inputs": [
{ "name": "image", "source": "/document/normalized_images/*/data" },
{ "name": "imageDetail", "source": "=high" },
{ "name": "systemMessage", "source": "='You are a useful AI assistant.'" },
{ "name": "userMessage", "source": "='Describe this image:'" }
],
"outputs": [ { "name": "response" } ],
"uri": "https://demo.openai.azure.com/openai/deployments/gpt-4o/chat/completions",
"authIdentity": "11111111-2222-3333-4444-555555555555",
"responseFormat": { "type": "text" }
}
Structured numerical fact-finder
{
"@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
"name": "NumericalFactFinder",
"context": "/document",
"inputs": [
{ "name": "systemMessage", "source": "='You are an AI assistant that helps people find information.'" },
{ "name": "userMessage", "source": "='Find all the numerical data and put it in the specified fact format.'"},
{ "name": "text", "source": "/document/content" }
],
"outputs": [ { "name": "response" } ],
"uri": "https://demo.openai.azure.com/openai/deployments/gpt-4o/chat/completions",
"apiKey": "<api-key>",
"responseFormat": {
"type": "json_schema",
"jsonSchemaProperties": {
"name": "NumericalFactObj",
"strict": true,
"schema": {
"type": "object",
"properties": "{\"facts\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"number\":{\"type\":\"number\"},\"fact\":{\"type\":\"string\"}},\"required\":[\"number\",\"fact\"]}}}",
"required": [ "facts" ],
"additionalProperties": false
}
}
}
}
Sample output (truncated)
{
"response": {
"facts": [
{ "number": 32.0, "fact": "Jordan scored 32 points per game in 1986-87." },
{ "number": 6.0, "fact": "He won 6 NBA championships." }
]
},
"usageInformation": {
"usage": {
"completion_tokens": 203,
"prompt_tokens": 248,
"total_tokens": 451
}
}
}
Best practices
- Chunk long documents with the Text Split skill to stay within the model’s context window.
- For high-volume indexing, dedicate a separate model deployment to this skill so that token quotas for query-time RAG workloads remain unaffected.
- To minimize latency, co-locate the model and your Azure AI Search service in the same Azure region.
- Use
responseFormat.json_schemawith GPT-4o for reliable structured extraction and easier mapping to index fields. - Monitor token usage and submit quota-increase requests if the indexer saturates your Tokens per Minute (TPM) limits.
Errors and warnings
| Condition | Result |
|---|---|
Missing or invalid uri |
Error |
| No authentication method specified | Error |
Both apiKey and authIdentity supplied |
Error |
| Unsupported model for multimodal prompt | Error |
| Input exceeds model token limit | Error |
Model returns invalid JSON for json_schema |
Warning – raw string returned in response |