Vector search support
Azure AI Search Vectors
Vector search is an information retrieval method that supports indexing and query execution over numerical representations of content. Because the content is numeric rather than plain text, the search engine matches on the vectors most similar to the query vector. Matching can span:
- conceptual or semantic similarity (“dog” and “canine” are conceptually similar but linguistically distinct)
- multilingual content (“hund” in German and “dog” in English)
- multiple content types (“dog” in plain text and a photograph of a dog in an image file)
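As a toy illustration of similarity-based matching, the following sketch compares hand-made vectors with cosine similarity. Real embeddings have hundreds or thousands of dimensions; these three-dimensional values are invented for illustration only.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented "embeddings": conceptually similar terms get nearby vectors.
dog    = [0.9, 0.8, 0.1]
canine = [0.85, 0.82, 0.12]
car    = [0.1, 0.2, 0.9]

print(cosine_similarity(dog, canine))  # close to 1.0: conceptually similar
print(cosine_similarity(dog, car))     # much lower: dissimilar concepts
```

A real system would obtain these vectors from an embedding model rather than writing them by hand; the comparison step is the same.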
This article gives an overview of vectors in Azure AI Search. It covers the concepts and vocabulary of vector search development, as well as integration with other Azure services.
We recommend this article for background, but if you’d rather get started right away, follow these steps:
- Provide embeddings for your index, or have an indexer pipeline generate them.
- Create a vector index.
- Run vector queries.
What scenarios does vector search support?
Vector search scenarios include:
Similarity search: Encode text using embedding models (such as OpenAI embeddings) or open-source models (such as SBERT), and retrieve documents with queries that are also encoded as vectors.
Multimodal search: Encode text and images using multimodal embeddings (for example, OpenAI CLIP or GPT-4 Turbo with Vision in Azure OpenAI), and query an embedding space composed of vectors from both content types.
Hybrid search: In Azure AI Search, hybrid search means running a vector query and a keyword query in a single request. Vector support is implemented at the field level, with an index containing both vector fields and searchable text fields. The queries execute in parallel, and the results are merged into a single response. Optionally, add semantic ranking for more accuracy with L2 reranking using the same language models that power Bing.
Multilingual search: Embedding models and chat models trained in multiple languages make it feasible to provide a search experience in the user’s own language. If you need more control over translation, you can supplement with the multi-language capabilities that Azure AI Search offers for nonvector content in hybrid search scenarios.
Filtered vector search: A query request can include both a vector query and a filter expression. Filters apply to text and numeric fields, and are useful for metadata filters or for including or excluding search results based on filter criteria. Although a vector field isn’t filterable itself, you can set up a filterable text or numeric field alongside it. The search engine can process the filter before or after the vector query executes.
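A minimal in-memory sketch of the pre-filter mode, where a metadata filter narrows the candidate set before vector scoring. The document ids, the category field, and the two-dimensional vectors are all invented for illustration.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy documents: a filterable metadata field plus an invented embedding.
docs = [
    {"id": "1", "category": "pets",  "vector": [0.9, 0.1]},
    {"id": "2", "category": "autos", "vector": [0.1, 0.9]},
    {"id": "3", "category": "pets",  "vector": [0.8, 0.3]},
]

def filtered_vector_search(query_vec, category, k=2):
    # Pre-filter: restrict the candidate set by metadata first,
    # then rank only the survivors by vector similarity.
    candidates = [d for d in docs if d["category"] == category]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return [d["id"] for d in ranked[:k]]

print(filtered_vector_search([1.0, 0.2], "pets"))  # ['1', '3']
```

Post-filtering instead applies the metadata test after the vector ranking, which can return fewer than k results when many top matches fail the filter.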
Vector database: Azure AI Search stores the data that you query over. Use it as a pure vector store for any application that needs vectors, long-term memory, or a knowledge base, or as grounding data for a RAG architecture.
How vector search works in Azure AI Search
Vector support includes indexing, storing, and querying vector embeddings from a search index.
The following diagram shows the indexing and query workflows for vector search.
On the indexing side, Azure AI Search uses a nearest neighbors algorithm to place similar vectors close together in an index, based on their vector embeddings. Internally, it creates vector indexes for each vector field.
On the query side, your client application collects user input, typically through a prompt workflow. An encoding step then converts the input into a vector, and you send the vector query to your Azure AI Search index for similarity search. As with indexing, you can use integrated vectorization to convert the query into a vector. In either approach, Azure AI Search returns documents with the requested k nearest neighbors (kNN) in the results.
Azure AI Search supports hybrid scenarios that run vector and keyword search in parallel, returning a unified result set that often provides better results than either technique alone. For hybrid search, vector and nonvector content are ingested into the same index, and the queries run side by side.
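Azure AI Search merges the parallel result sets using Reciprocal Rank Fusion (RRF), which scores each document by the reciprocal of its rank in every list that contains it. A minimal sketch of the fusion step, with invented document ids:

```python
def rrf_merge(rankings, k=60):
    # Reciprocal Rank Fusion: each result list contributes 1 / (k + rank)
    # for every document it contains; scores are summed across lists.
    # k=60 is the commonly used smoothing constant from the RRF paper.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_results = ["doc2", "doc1", "doc5"]   # keyword-ranked ids (invented)
vector_results  = ["doc1", "doc3", "doc2"]   # vector-ranked ids (invented)

print(rrf_merge([keyword_results, vector_results]))
# doc1 wins: it appears near the top of both lists.
```

Documents found by both queries accumulate score from both lists, which is why hybrid search often surfaces results that neither query ranks first on its own.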
Pricing and availability
Vector search is available in all Azure AI Search tiers, in all regions, at no extra charge.
Newer services created after April 3, 2024 support higher vector index quotas.
You can use vector search in:
- Azure portal: data import and vectorization wizard
- Azure REST APIs
- Azure SDKs for Python, JavaScript, and .NET
- Other Azure offerings, such as Azure AI Foundry.
Note
Some older search services created before January 1, 2019 are deployed on infrastructure that can’t support vector workloads. If errors appear when you try to add a vector field to a schema, an outdated service is the cause. In this situation, you must create a new search service to try out the vector feature.
Azure integration and associated services
Azure AI Search is deeply integrated across the Azure AI platform. The following table lists several products that are useful in vector workloads.
| Product | Integration |
| --- | --- |
| Azure AI Foundry | In the chat with your data playground, Add your own data uses Azure AI Search for grounding data and conversational search. This is the easiest and fastest approach for chatting with your data. |
| Azure OpenAI | Azure OpenAI provides embedding models and chat models. Demos and samples target the text-embedding-ada-002 model. We recommend Azure OpenAI for generating embeddings for text. |
| Azure AI Services | The Image Retrieval Vectorize Image API (preview) supports vectorization of image content. We recommend this API for generating embeddings for images. |
| Azure data platforms: Azure Blob Storage, Azure Cosmos DB | You can use indexers to automate data ingestion, and then use integrated vectorization to generate embeddings. Azure AI Search can automatically index vector data from two data sources: Azure blob indexers and Azure Cosmos DB for NoSQL indexers. For more information, see Add vector fields to a search index. |
It’s also commonly used in open-source frameworks like LangChain.
Vector search concepts
This section outlines some fundamental ideas for those who are unfamiliar with vectors.
About vector search
Vector search is an information retrieval technique that represents documents and queries as vectors rather than plain text. In vector search, machine learning models generate vector representations of source inputs, which can be text, images, or other content. The mathematical representation of content provides a common basis for search scenarios. If everything is a vector, a query can find a match in vector space even when the original content is in a different medium or language than the query.
Why use vector search
When searchable content is represented as vectors, a query can find close matches in similar content. The embedding model used for vector generation knows which words and concepts are similar, and it places the resulting vectors close together in the embedding space. For example, vectorized source documents about “clouds” and “fog” are more likely to show up in a query about “mist” because they’re semantically similar, even though they aren’t lexical matches.
Using vectorization and embeddings
An embedding is a specific kind of vector representation of content or a query, created by machine learning models that capture the semantic meaning of text or representations of other content, such as images. Natural language machine learning models are trained on large volumes of data to identify patterns and relationships between words. During training, they learn to represent any input as a vector of real numbers in an intermediary step called the encoder. After training completes, these language models can be modified so that the intermediate vector representation becomes the model’s output. The resulting embeddings are high-dimensional vectors, where words with similar meanings are closer together in the vector space, as explained in Understand embeddings (Azure OpenAI).
How effectively vector search retrieves relevant information depends on how well the embedding model distills the meaning of documents and queries into the resulting vectors. The best models are well trained on the types of data they represent. You can evaluate existing models such as Azure OpenAI text-embedding-ada-002, bring your own model trained directly on the problem space, or fine-tune a general-purpose model. Azure AI Search doesn’t impose constraints on which model you use, so choose the one that works best for your data.
When developing effective embeddings for vector search, it’s important to be mindful of input size limits. We recommend following the guidelines for chunking data before generating embeddings. This best practice ensures that the embeddings accurately capture the relevant information and makes vector search more efficient.
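A minimal sketch of fixed-size chunking with overlap. The word counts are illustrative; production chunking is often token-based and depends on the embedding model's input limit.

```python
def chunk_text(text, max_words=100, overlap=20):
    # Fixed-size chunking with overlap: each chunk shares `overlap` words
    # with its predecessor, so content that straddles a chunk boundary is
    # still fully represented in at least one chunk.
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

# 250 words with a 100-word window and 20-word overlap yields 3 chunks:
# words 0-99, 80-179, and 160-249.
print(len(chunk_text("word " * 250)))
```

Each chunk would then be sent to the embedding model separately, producing one vector per chunk rather than one per document.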
What is the embedding space?
The embedding space is the corpus for vector queries. Within a search index, an embedding space is all the vector fields populated with embeddings from the same embedding model. Machine learning models create the embedding space by mapping individual words, phrases, or documents (for natural language processing), images, or other forms of data into a representation made up of a vector of real numbers, where each vector is a coordinate in a high-dimensional space. In this embedding space, similar items are located close together, and dissimilar items are located farther apart.
Documents discussing various dog species, for instance, would be grouped together in the embedding space. Documents regarding cats would be in the same neighborhood as those about dogs, but they would be farther apart. Cloud computing and other dissimilar ideas would be much farther away. The fundamental principle remains the same even though these embedding spaces are abstract in practice and lack clear, understandable meanings.
Nearest neighbors search
In a vector query, the search engine scans the embedding space for vectors closest to the query vector, a technique known as nearest neighbor search. Nearest neighbors help quantify how similar two items are to each other: a high degree of vector similarity indicates that the original data was similar too. To enable fast nearest neighbor search, the search engine performs optimizations, or uses data structures and data partitioning, to reduce the search space. Each vector search algorithm solves the nearest neighbor problem in a different way, optimizing for minimum latency, maximum throughput, recall, and memory. To compute similarity, similarity metrics provide the mechanism for computing distance.
The following algorithms are currently supported by Azure AI Search:
- Hierarchical Navigable Small World (HNSW): One of the leading ANN algorithms for high-recall, low-latency applications where the data distribution is unknown or subject to frequent change. It organizes high-dimensional data points into a hierarchical graph structure that enables fast, scalable similarity search, with a tunable trade-off between search accuracy and computational cost. Because the algorithm requires all data points to reside in memory for fast random access, it consumes the vector index size quota.
- Exhaustive K-nearest neighbors (KNN): Calculates the distances between the query vector and all data points. It's computationally intensive, so it works best on smaller datasets. Because the algorithm doesn’t require fast random access to data points, it doesn’t consume the vector index size quota. However, this algorithm provides the global set of nearest neighbors.
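A minimal sketch of exhaustive KNN over an in-memory corpus (the ids and two-dimensional vectors are invented), showing the exact-but-O(n) character of the algorithm: every vector is scored against the query.

```python
import heapq
import math

def exhaustive_knn(query, docs, k=3):
    # Exhaustive KNN: score *every* vector against the query, then keep
    # the k best. Exact (global nearest neighbors), but O(n) per query.
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in docs.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

# Invented 2-D vectors standing in for real embeddings.
docs = {"a": [1.0, 0.0], "b": [0.9, 0.4], "c": [0.0, 1.0], "d": [0.7, 0.7]}
print(exhaustive_knn([1.0, 0.1], docs, k=2))  # ['a', 'b']
```

Because every document is scored, the result is guaranteed to be the true top k; ANN methods such as HNSW give up that guarantee in exchange for scanning far fewer vectors.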
For details on how to specify the algorithm, vector profiles, and profile assignment for these algorithms, see Create a vector field.
Algorithm parameters that are used to initialize the index during index creation are immutable and can’t be changed after the index is built. However, parameters that affect query-time characteristics (efSearch) can be modified.
In addition, fields that specify the HNSW algorithm also support exhaustive KNN search through the query request parameter "exhaustive": true. The opposite isn’t true, however: if a field is indexed for exhaustiveKnn, you can’t use HNSW in the query, because the additional data structures that enable efficient search don’t exist.
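For illustration, a vector query body that forces exact search on an HNSW field might look like the following sketch. The field name contentVector and the vector values are assumptions; check the REST API reference for your service's API version.

```python
import json

# Illustrative query body for a vector search request.
# "exhaustive": True asks the engine to bypass the HNSW graph and
# score every vector in the field (exact KNN), e.g. to measure recall.
query_body = {
    "search": None,  # no keyword component: pure vector query
    "vectorQueries": [
        {
            "kind": "vector",
            "vector": [0.01, -0.02, 0.03],  # stand-in for a real embedding
            "fields": "contentVector",      # assumed vector field name
            "k": 5,
            "exhaustive": True,
        }
    ],
}

print(json.dumps(query_body, indent=2))
```

Comparing the results of the same query with and without "exhaustive": true is a common way to estimate the recall of the HNSW configuration.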
Approximate nearest neighbors
Approximate nearest neighbor (ANN) search is a class of algorithms for finding matches in vector space. This class of algorithms uses different data structures or data partitioning methods to significantly reduce the search space and accelerate query processing.
In contemporary information retrieval applications, ANN algorithms are perfect for striking a balance between accuracy and efficiency since they provide scalable and faster retrieval of approximate nearest neighbors at the expense of some precision. You can fine-tune your algorithm’s parameters to meet your search application’s memory, disk footprint, recall, and latency needs.
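HNSW itself is graph-based, but the search-space-reduction idea behind ANN can be illustrated with a simpler partition-based sketch (IVF-style; the centroids, ids, and vectors are invented): the query probes only the nearest partition instead of scanning the whole corpus.

```python
import math

def dist(a, b):
    # Euclidean distance between two vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Assumed fixed centroids; real systems learn these from the data.
centroids = [[1.0, 0.0], [0.0, 1.0]]

def build_buckets(docs):
    # Assign each vector to its nearest centroid's bucket.
    buckets = {i: [] for i in range(len(centroids))}
    for doc_id, vec in docs.items():
        nearest = min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))
        buckets[nearest].append((doc_id, vec))
    return buckets

def ann_search(query, buckets, k=2, nprobe=1):
    # Probe only the `nprobe` closest partitions instead of the whole corpus:
    # faster, but a true neighbor in an unprobed bucket is missed (recall loss).
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = [pair for i in order[:nprobe] for pair in buckets[i]]
    ranked = sorted(candidates, key=lambda pair: dist(query, pair[1]))
    return [doc_id for doc_id, _ in ranked[:k]]

docs = {"a": [0.9, 0.1], "b": [0.8, 0.3], "c": [0.1, 0.9], "d": [0.2, 0.8]}
buckets = build_buckets(docs)
print(ann_search([1.0, 0.0], buckets))  # ['a', 'b']
```

Raising nprobe here plays the same role as raising efSearch in HNSW: more of the space is examined, improving recall at the cost of latency.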
Azure AI Search uses HNSW for its ANN algorithm.