Learn how to generate embeddings

An embedding is a special format of data representation that machine learning models and algorithms can easily use. An embedding is an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space correlates with the semantic similarity between two inputs in the original format. For example, if two texts are similar, their vector representations are also similar. Embeddings power vector similarity search in Azure databases such as Azure Cosmos DB for NoSQL, Azure SQL Database, and Azure Database for PostgreSQL - Flexible Server.

Prerequisites

An Azure OpenAI embedding model deployed.
The following values from your resource:
- Endpoint, for example, https://YOUR-RESOURCE-NAME.openai.azure.com/.
- API key.
- Model deployment name.

For more language-specific setup guidance, see Azure OpenAI supported programming languages.

How to get embeddings

To get an embedding vector for a piece of text, make a request to the embeddings endpoint as shown in the following code snippets:

Note

The Azure OpenAI embeddings API doesn't currently support Microsoft Entra ID with the v1 API. Use API key authentication for the examples in this article.

using OpenAI;
using OpenAI.Embeddings;
using System.ClientModel;

EmbeddingClient client = new(
    "text-embedding-3-small",
    credential: new ApiKeyCredential("API-KEY"),
    options: new OpenAIClientOptions()
    {

        Endpoint = new Uri("https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1")
    }
);

string input = "This is a test";

OpenAIEmbedding embedding = client.GenerateEmbedding(input);
ReadOnlyMemory<float> vector = embedding.ToFloats();
Console.WriteLine($"Embeddings: [{string.Join(", ", vector.ToArray())}]");

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/openai/openai-go/v2"
	"github.com/openai/openai-go/v2/option"
)

func main() {
	// Get API key from environment variable
	apiKey := os.Getenv("AZURE_OPENAI_API_KEY")
	if apiKey == "" {
		panic("AZURE_OPENAI_API_KEY environment variable is not set")
	}

	// Create a client with Azure OpenAI endpoint and API key
	client := openai.NewClient(
		option.WithBaseURL("https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/"),
		option.WithAPIKey(apiKey),
	)

	ctx := context.Background()
	text := "The attention mechanism revolutionized natural language processing"

	// Make an embedding request
	embedding, err := client.Embeddings.New(ctx, openai.EmbeddingNewParams{
		Input: openai.EmbeddingNewParamsInputUnion{OfString: openai.String(text)},
		Model: "text-embedding-3-large", // Use your deployed model name on Azure
	})
	if err != nil {
		panic(err.Error())
	}

	// Print embedding information
	fmt.Printf("Model: %s\n", embedding.Model)
	fmt.Printf("Number of embeddings: %d\n", len(embedding.Data))
	fmt.Printf("Embedding dimensions: %d\n", len(embedding.Data[0].Embedding))
	fmt.Printf("Usage - Prompt tokens: %d, Total tokens: %d\n", embedding.Usage.PromptTokens, embedding.Usage.TotalTokens)
	
	// Print first few values of the embedding vector
	fmt.Printf("First 10 embedding values: %v\n", embedding.Data[0].Embedding[:10])
}

import OpenAI from "openai";
const client = new OpenAI({
    baseURL: "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    apiKey: process.env['OPENAI_API_KEY'] //Your Azure OpenAI API key
});

const embedding = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "Your text string goes here",
});

console.log(embedding);

import os
from openai import OpenAI

client = OpenAI(
  api_key = os.getenv("AZURE_OPENAI_API_KEY"),  
  base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/"
)

response = client.embeddings.create(
    input = "Your text string goes here",
    model= "text-embedding-3-large"
)

print(response.model_dump_json(indent=2))

# Azure OpenAI metadata variables
$openai = @{
    api_key     = $Env:AZURE_OPENAI_API_KEY
    api_base    = $Env:AZURE_OPENAI_ENDPOINT # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
    name        = 'YOUR-DEPLOYMENT-NAME-HERE' #This will correspond to the custom name you chose for your deployment when you deployed a model.
}

$headers = [ordered]@{
    'api-key' = $openai.api_key
}

$text = 'Your text string goes here'

$body = [ordered]@{
    input = $text
	model = $openai.name
} | ConvertTo-Json

$url = "$($openai.api_base)/openai/v1/embeddings"

$response = Invoke-RestMethod -Uri $url -Headers $headers -Body $body -Method Post -ContentType 'application/json'
return $response.data.embedding

curl https://YOUR_RESOURCE_NAME.openai.azure.com/openai/v1/embeddings \
  -H 'Content-Type: application/json' \
  -H 'api-key: YOUR_API_KEY' \
	-d '{"model": "YOUR-DEPLOYMENT-NAME", "input": "Sample Document goes here"}'

Best practices

Tip

Embedding requests return HTTP 400 when the sum of input tokens exceeds 300,000, even if every individual input is well under the per-input limit. If you previously batched large arrays of long inputs successfully, split them into smaller requests.

Verify inputs don't exceed the maximum length

The maximum length of input text for the latest embedding models is 8,192 tokens. Verify that your inputs don't exceed this limit before making a request.
If you send an array of inputs in a single embedding request, the maximum array size is 2,048.
Each /embeddings request also has a hard cap of 300,000 tokens summed across all inputs. Requests that exceed this aggregate limit fail with HTTP 400, even when every individual input is under 8,192 tokens and the array length is under 2,048. Batch large workloads into multiple smaller requests to stay under the cap.
When you send an array of inputs in a single request, remember that the number of tokens per minute in your requests must stay below the quota limit assigned to the model deployment. By default, the latest generation 3 embeddings models are subject to a 350 K TPM per region limit.

Troubleshooting

If you get a 401 or 403 error, confirm the API key is valid for the resource.
If you get a 404 error, confirm the endpoint includes the /openai/v1/ path and you used the correct base URL.
If you get a 400 error, confirm model is set to your deployment name and the request body is valid JSON.

Limitations & risks

Embedding models might be unreliable or pose social risks in certain cases. They might cause harm if used without mitigations. For more information about how to approach their use responsibly, see the Responsible AI content.

Next steps

To learn more about using Azure OpenAI and embeddings to perform document search, see the embeddings tutorial.
To learn more about the underlying models that power Azure OpenAI.
To store your embeddings and perform vector (similarity) search, choose from the following services:

Feedback

Was this page helpful?

Last updated on 2026-05-13