Configure vector search.
For Lucidworks Fusion Services
Configure Vector Search - Semantic Vector Search (SVS)
1. Put Semantic Vector Search in Place
Compatible versions: 5.5, 5.9, 5.10, 5.11, and 5.12.
Product: Fusion
These instructions are for Milvus and do not apply to Solr SVS.
2. Recommended Steps to Begin Using Semantic Vector Search (SVS)
Step 1: Create Training Data
SVS models use training data to establish links between catalog items and customer queries. Training data has the largest impact on the quality and performance that an SVS model can deliver.
Step 2: Examine the Themes of Zero Search Results
To create training data pairs that fit the problems customers actually run into, first identify the common themes among the top zero search result queries. Review a report of the top zero search result queries and classify them into categories such as:
- Available/Unavailable
- Not Carried Product
- Foreign Words
- Spelling Errors
Step 3: Make Training Pairs with Zero Search Results
The clearest sign that a search engine is struggling to return relevant results is a Zero Search Result (ZSR) query. Develop training data pairs that capture what users were looking for when they submitted such queries:
- Examine the products that a customer adds to their cart after seeing a page with no results.
- A subset of customers will rephrase a ZSR query and search again in a different way.
- This process is known as "Learning from the Persistent Shopper."
- Generate pairs of ZSR Query and Next Item Added-to-Cart.
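The pairing described above can be sketched in a few lines of Python. This is a minimal illustration, not Fusion's signals schema: the event tuple layout, the session IDs, and the helper name `zsr_pairs` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical event records: (session_id, timestamp, event_type, value, result_count).
# This layout is an assumption for illustration, not a Fusion signals schema.
events = [
    ("s1", 1, "search", "sofa bed", 0),
    ("s1", 2, "search", "sleeper sofa", 12),
    ("s1", 3, "add_to_cart", "SKU-100", None),
    ("s2", 1, "search", "sofa bed", 0),
    ("s2", 2, "add_to_cart", "SKU-100", None),
    ("s3", 1, "search", "lamp", 8),
    ("s3", 2, "add_to_cart", "SKU-200", None),
]


def zsr_pairs(events):
    """Count (zero-result query, next item added to cart) pairs per session."""
    pairs = Counter()
    pending = {}  # session_id -> zero-result queries still waiting for an add-to-cart
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for session, _ts, kind, value, result_count in ordered:
        if kind == "search" and result_count == 0:
            pending.setdefault(session, []).append(value)
        elif kind == "add_to_cart":
            # Attribute the cart item to every unresolved zero-result query.
            for query in pending.pop(session, []):
                pairs[(query, value)] += 1
    return pairs


pairs = zsr_pairs(events)
print(dict(pairs))  # {('sofa bed', 'SKU-100'): 2}
```

The counts kept per pair are what a later noise filter can act on.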
Step 4: Make Training Pairs for Abandoned Search
Abandoned Searches (AS) are another sign of problematic queries or low relevancy. An AS occurs when a customer submits a search query and gets search results but does not click on any of them. Create pairs of the form:
- AS Query and Next Item Added-to-Cart.
Step 5: Converted Search Training Pairs
Converted Search (CS) describes the "happy path" a customer follows when they find exactly what they're looking for in the search results and then add that item to their cart. This type of training data is crucial for improving the SVS model's precision and accuracy. It's also the easiest training data to generate.
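The three kinds of search described in Steps 3 to 5 can be told apart with a simple rule, sketched below. The function name, its arguments, and the "other" label are assumptions for illustration; the real signals available in a given deployment may differ.

```python
def classify_search(result_count, clicked, added_to_cart):
    """Label one search using the ZSR / AS / CS definitions from the text."""
    if result_count == 0:
        return "ZSR"    # zero search results
    if added_to_cart:
        return "CS"     # converted search: an item from the results went into the cart
    if not clicked:
        return "AS"     # abandoned search: results shown, none clicked
    return "other"      # clicked a result but did not add it to the cart


print(classify_search(0, False, False))   # ZSR
print(classify_search(12, False, False))  # AS
print(classify_search(12, True, True))    # CS
```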
Step 6: Filter the Training Data
The final stage in the training data creation process is to filter out noisy data pairs. Pairs with low aggregation counts, meaning pairs observed only a few times, are the most likely to be noise, so remove them.
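As a rough sketch, the filter can be a threshold on the aggregated count of each pair. The threshold of 2 and the sample pairs are assumptions chosen for the example; a suitable cutoff depends on traffic volume.

```python
def filter_pairs(pair_counts, min_count=2):
    """Keep only (query, product) pairs seen at least min_count times."""
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical aggregated pairs: (query, product id) -> number of observations.
pair_counts = {
    ("sofa bed", "SKU-100"): 14,
    ("sofa bed", "SKU-999"): 1,   # seen once, likely noise
    ("desk lamp", "SKU-200"): 3,
}

clean = filter_pairs(pair_counts, min_count=2)
print(sorted(clean))  # [('desk lamp', 'SKU-200'), ('sofa bed', 'SKU-100')]
```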
Step 7: Get the SVS Models Trained
Use this training data to train SVS deep learning models with supervised learning. Pre-trained models serve as a foundation for understanding connections between products in the catalog and are then refined using the carefully constructed training data.
Step 8: Deploy the Model
After the training job is finished, deploy the model using Seldon-Core, an open-source tool for managing and deploying models inside a Kubernetes cluster.
Step 9: Index the Product Catalog
Encoding the product catalog into a vector space involves generating a vector for every product in the catalog and storing them in Milvus for vector similarity searches.
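To make the idea concrete, here is a toy sketch of encoding a catalog and running a similarity search. It does not use the trained SVS model or Milvus: `toy_encode` is a stand-in bag-of-words encoder, `vector_index` is an in-memory dictionary standing in for the vector store, and the catalog entries are made up for the example.

```python
import math

# Hypothetical catalog; in practice the titles come from the product index.
catalog = {
    "SKU-100": "sleeper sofa queen",
    "SKU-200": "table lamp brass",
}

# Toy vocabulary built from the catalog. A trained SVS model would replace this.
vocab = sorted({tok for title in catalog.values() for tok in title.lower().split()})
position = {tok: i for i, tok in enumerate(vocab)}


def toy_encode(text):
    """Stand-in encoder: unit-length bag-of-words vector over the toy vocabulary."""
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in position:
            vec[position[tok]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


# In-memory stand-in for the vector store: one vector per product.
vector_index = {sku: toy_encode(title) for sku, title in catalog.items()}


def vector_search(query, top_k=1):
    """Rank products by cosine similarity (dot product of unit vectors)."""
    q = toy_encode(query)

    def score(sku):
        return sum(a * b for a, b in zip(q, vector_index[sku]))

    return sorted(vector_index, key=score, reverse=True)[:top_k]


print(vector_search("sleeper sofa"))  # ['SKU-100']
```

A learned encoder differs from this toy in that it can place a query near a product even when they share no words, which is the point of training on query and product pairs.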
Step 10: Use an Index Pipeline to Integrate
Add the "Encode into Milvus" step to the index pipeline in order to vectorize products and store them inside a Milvus collection. Ensure that a matching milvus_id is kept in the product record of the Solr collection.
Step 11: Create a Job in Milvus
Name the collection and the job, then enter the Dimension parameter. This parameter is usually double the RNN Function Units List parameter.
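The rule of thumb above is simple arithmetic, and a quick check can catch a mismatch between the configured Dimension and the vectors the model produces. In this sketch the units value of 128 is purely illustrative, and both helper names are hypothetical.

```python
def expected_dimension(rnn_units):
    """Rule of thumb from the text: Dimension is usually double the units value."""
    return 2 * rnn_units


def check_vector(vector, dimension):
    """Raise early if an encoded vector does not match the collection dimension."""
    if len(vector) != dimension:
        raise ValueError(f"vector has {len(vector)} values, expected {dimension}")
    return vector


dimension = expected_dimension(128)  # 128 is an illustrative value only
check_vector([0.0] * dimension, dimension)
print(dimension)  # 256
```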
Step 12: Refresh the Model
Periodically retrain the SVS models on the latest customer training data, then reindex the catalog so that the stored vectors match the refreshed models.
Step 13: Connect to a Query Pipeline
Integrate vector search into query pipelines to prepare for production and load testing. Ensure that:
- Vector search runs only when the main query pipeline returns zero results.
- Search results preserve sorting and faceting capabilities.
- Searches using vector search components are marked for analytics.
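The fallback behavior and the analytics flag in the list above can be sketched as follows. This is plain Python rather than a Fusion query pipeline stage; `keyword_search`, `vector_search`, and the `used_vector_search` field are hypothetical names, and sorting and faceting are left out.

```python
def search_with_fallback(query, keyword_search, vector_search):
    """Run the main search first and fall back to vector search on zero results."""
    docs = keyword_search(query)
    if docs:
        return {"docs": docs, "used_vector_search": False}
    # Zero results: fall back and mark the response so analytics can tell the two apart.
    return {"docs": vector_search(query), "used_vector_search": True}


# Stand-in search functions for the example.
def keyword_search(query):
    return ["SKU-200"] if query == "lamp" else []


def vector_search(query):
    return ["SKU-100"]


print(search_with_fallback("lamp", keyword_search, vector_search))
# {'docs': ['SKU-200'], 'used_vector_search': False}
print(search_with_fallback("sofa bed", keyword_search, vector_search))
# {'docs': ['SKU-100'], 'used_vector_search': True}
```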
Step 14: Test the Load
Load test against your requirements. If the vector search setup does not perform adequately, consider adjusting configuration settings, such as changing resource allocation or disabling Solr spell check.
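A very small load-test harness might look like the sketch below. It is an illustration only: `run_load_test` is a hypothetical helper, the search function is a stand-in that sleeps, and a real test would call the deployed query endpoint with production-like queries and concurrency.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(search_fn, query):
    """Return the wall-clock seconds taken by one search call."""
    start = time.perf_counter()
    search_fn(query)
    return time.perf_counter() - start


def run_load_test(search_fn, queries, concurrency=4):
    """Run the queries concurrently and summarize latency."""
    if not queries:
        raise ValueError("queries must not be empty")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda q: timed_call(search_fn, q), queries))
    p95_index = math.ceil(0.95 * len(latencies)) - 1  # nearest-rank percentile
    return {
        "count": len(latencies),
        "median": latencies[len(latencies) // 2],
        "p95": latencies[p95_index],
    }


def fake_search(query):
    time.sleep(0.001)  # stand-in for a real search request


stats = run_load_test(fake_search, ["sofa bed"] * 20)
print(stats["count"])
```

Comparing the reported latencies before and after a configuration change gives a simple way to judge whether the change helped.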