Nowadays, immediate data availability is a key requirement of most real-world applications.
Organisations around the globe have to make data available to users as soon as possible in order to
keep up with customer demands. Solr is well suited to this: its core engine is efficient enough to
make content searchable shortly after ingestion.
Solr has a concept known as “Near Real Time” search, also referred to as NRT. Since SolrCloud was
introduced, NRT has been a main feature, and it is configurable: the settings determine how long it
takes for ingested data to become searchable. Prior to SolrCloud, NRT was not available in an
efficient form.
solrconfig.xml exposes the concept of “commits”, which is directly related to NRT and controls
document durability and searchability. Solr has two types of commit, known as “hard” and
“soft”. Either type can be issued by a client (e.g. SolrJ), sent via a REST call, or configured to
occur automatically in solrconfig.xml. Typically in NRT applications, hard commits are configured
with openSearcher=false, and soft commits are configured to make documents visible for search.
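
The typical NRT setup described above might look like the following fragment of solrconfig.xml. This is a minimal sketch: the element names are standard Solr update handler settings, but the interval values (60 seconds for hard commits, 1 second for soft commits) are illustrative assumptions, not recommendations from this article.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes index files to stable storage for durability.
       openSearcher=false means it does not make documents visible. -->
  <autoCommit>
    <maxTime>60000</maxTime>            <!-- assumed value, in milliseconds -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes recently indexed documents visible for search. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>             <!-- assumed value, in milliseconds -->
  </autoSoftCommit>
</updateHandler>
```

With values like these, a newly added document would become searchable roughly one second after it is indexed, while the more expensive flush to disk happens about once a minute.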
When a commit occurs, various background tasks are initiated. However, these background tasks do
not block additional updates to the index, nor do they delay the availability of the documents for
search.
When configuring for NRT, pay special attention to cache and autowarm settings as they can have a
significant impact on NRT performance. For extremely short autoCommit intervals, consider
disabling caching and autowarming completely.
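
As a rough illustration of the cache advice above, autowarming can be turned off by setting autowarmCount to 0 on each cache in the query section of solrconfig.xml. The sizes shown are placeholder assumptions, and the cache class depends on the Solr version: solr.CaffeineCache is used in recent releases, while older releases use solr.LRUCache or solr.FastLRUCache.

```xml
<query>
  <!-- autowarmCount="0": a new searcher starts with empty caches instead of
       replaying entries from the old one. Sizes are assumed placeholders. -->
  <filterCache class="solr.CaffeineCache"
               size="512" initialSize="512" autowarmCount="0"/>
  <queryResultCache class="solr.CaffeineCache"
                    size="512" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.CaffeineCache"
                 size="512" initialSize="512" autowarmCount="0"/>
</query>
```

To go further and disable a cache entirely, the usual approach is to remove or comment out its element. Whether that helps depends on the workload, so it is worth measuring before and after.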
Examples of NRT Search Applications
- New Articles
- Legal Data
- Citizen Data
- Insurance Data
Commits and Searching
As noted above, Solr has two types of commit: “hard” and “soft”. Both are described below.
A hard commit calls fsync on the index files to ensure they have been flushed to stable storage.
Optionally a hard commit can also make documents visible for search, but this is not recommended
for NRT searching as it is more expensive than a soft commit.
A soft commit is faster since it only makes index changes searchable and does not fsync index files.
Search collections that have NRT requirements will want to soft commit often enough to satisfy the
visibility requirements of the application. A soft commit may be “less expensive” than a hard commit
(openSearcher=true), but it is not free. It is recommended that the soft commit interval be set as
long as is reasonable given the application requirements.
Both hard and soft commits have two primary configuration parameters: maxDocs and maxTime.