Tuning Solr

At some point, you may need to tune Solr¹ to handle heavier indexing and search loads, especially in production environments where the search index is used heavily. This page covers some of the configuration options available for optimizing Solr and making it production-ready. The tuning methods presented below are a general guide to optimizing a Solr server; consider your organization's use case before committing to a different configuration.

Monitoring performance

Before making any changes, it is important to know which parts of Solr need further optimization. As a starting point, check your Solr instance's performance statistics and log files, both of which are accessible via the Solr Admin UI (not available on the embedded version). To view plugin metrics in the Solr Admin UI, follow the steps below:

Solr Plugin Metrics

1. Navigate to http://<host>:<port>/solr².
2. Select a Solr core to examine.
3. Click Plugins/Stats.
4. Select a plugin type to view its metrics.
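
The statistics shown under Plugins/Stats can also be read over HTTP. The sketch below builds a URL for Solr's MBeans request handler, which serves per-core plugin statistics. The host, port, and core name (localhost, 8983, mycore) are assumed placeholders, and the CACHE category is only an example.

```shell
# Assumed placeholders: replace with your own host, port, and core name.
SOLR_HOST="localhost"
SOLR_PORT="8983"
SOLR_CORE="mycore"

# Build the URL that returns statistics for one plugin category.
# $1 = plugin category, for example CACHE
plugin_stats_url() {
  printf 'http://%s:%s/solr/%s/admin/mbeans?stats=true&cat=%s&wt=json\n' \
    "$SOLR_HOST" "$SOLR_PORT" "$SOLR_CORE" "$1"
}

# Print the URL. With a running server, fetch it with:
#   curl -s "$(plugin_stats_url CACHE)"
plugin_stats_url CACHE
```

The function only prints the URL, so it can be checked without a running server before being passed to curl.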

JVM

Among the most important configurations for tuning the JVM is memory allocation³. While it is advisable to allocate more memory than the bare minimum that Solr needs, it is also important to leave sufficient memory for the operating system (OS), which improves performance by caching files from the Solr index. A good rule of thumb is to give Solr the memory it needs, add some extra, and leave the rest to the OS. The dashboard of the Solr Admin UI shows how much memory the Solr instance uses. To configure memory allocation in the JVM, use the -Xms argument to set the initial heap size and the -Xmx argument to set the maximum heap size. These arguments can be set in the include file (solr.in.sh or solr.in.cmd) located in the bin directory of your Solr installation.
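
As a minimal sketch, the heap arguments can be set through the SOLR_JAVA_MEM variable in solr.in.sh. The 4g figure below is an assumed example, not a recommendation; choose a value based on what the Admin UI dashboard reports for your instance.

```shell
# Fragment for solr.in.sh. The 4g value is an assumed example only.
# -Xms sets the initial heap size; -Xmx sets the maximum heap size.
SOLR_JAVA_MEM="-Xms4g -Xmx4g"
```

Solr reads the include file at startup, so restart Solr for the change to take effect.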

Update handlers

Update handlers make documents searchable by committing changes to the index. A handler performs either a normal commit or, as of Solr 4, a soft commit.

A normal commit, also called a "hard commit", writes the index files to disk and opens a new searcher in which the newly committed data is visible. Both steps make it an expensive operation. A soft commit is less expensive: it makes data searchable quickly by not writing to disk immediately. It is how Solr implements near-real-time (NRT) search.

Although a soft commit is cheaper, a hard commit is still needed at some point to ensure that the data is durable. The timing and behavior of commits can affect the performance of a Solr server. To configure Solr's commit behavior, edit the solrconfig.xml file.
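
As an illustration, automatic commits are configured in the updateHandler section of solrconfig.xml. The fragment below is a sketch with assumed interval values (15 seconds for hard commits, 1 second for soft commits); adjust them to your own indexing load and freshness requirements.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk at most 15 s after an update (value in ms, assumed). -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <!-- Do not open a new searcher on hard commit; visibility is left to soft commits. -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: make updates searchable at most 1 s after they arrive (value in ms, assumed). -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this split, hard commits handle durability while soft commits control how quickly new documents become visible.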

File descriptors

When Lucene performs incremental indexing, changes are written to new files, and Solr keeps most of these files open at the same time. On Unix-based operating systems, this can exceed the limit on open files and file descriptors, at which point the server will most likely crash. To fix this, raise the limit with the command ulimit -n [number]. It is recommended to set this to at least 65000 or to unlimited, depending on the limits of your OS. To check the current limit, run ulimit -a or look in the Solr Admin UI.
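
The check and the change can be sketched as a short script. It assumes a bash-compatible shell and uses the 65000 figure from above as the target; it raises the soft limit only when the hard limit allows it.

```shell
# Target taken from the recommendation above.
TARGET=65000

# Read the current soft and hard limits on open file descriptors.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "open files: soft=$soft hard=$hard"

# Raise the soft limit only if it is below the target and the hard limit permits it.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$TARGET" ]; then
  if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$TARGET" ]; then
    ulimit -Sn "$TARGET"
  else
    echo "hard limit $hard is below $TARGET; an administrator must raise it" >&2
  fi
fi
```

A limit set with ulimit applies only to the current shell and the processes started from it, so run it in the same session (or startup script) that launches Solr.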

Distributed indexing and searching

An index may become too large to fit on a single node or system. Solr supports horizontal scaling through sharding, which splits the index into pieces called shards and distributes them across multiple systems. When performing a query, Solr merges the results from each shard as if they came from a single index. These capabilities are collectively referred to as SolrCloud, which specializes in distributed search and indexing. SolrCloud is recommended for scaling because it supports automatic load balancing, fault tolerance, and other features needed for a distributed architecture. Traditional sharding is also possible, but it is more complex to configure. Read these resources to learn more about configuring sharding:

Learn how to migrate to SolrCloud

Martini features extensive documentation on how to shift to SolrCloud, from configuring the ZooKeeper ensemble, to configuring the cluster of Solr servers, and configuring Martini itself.
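
As a rough sketch of how sharding is requested in SolrCloud, the function below builds a Collections API URL that creates a collection split into a given number of shards. The host, port, collection name, and shard and replica counts are all assumed placeholders.

```shell
# Assumed placeholders: replace with the address of a node in your SolrCloud cluster.
SOLR_HOST="localhost"
SOLR_PORT="8983"

# Build a Collections API URL that creates a sharded collection.
# $1 = collection name, $2 = number of shards, $3 = replicas per shard
create_collection_url() {
  printf 'http://%s:%s/solr/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s\n' \
    "$SOLR_HOST" "$SOLR_PORT" "$1" "$2" "$3"
}

# Print the URL. With a running SolrCloud cluster, send it with:
#   curl -s "$(create_collection_url products 2 2)"
create_collection_url products 2 2
```

Here, products is a hypothetical collection name; two shards with two replicas each is only an example layout.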

Further tuning

Most Solr settings can be adjusted in the solrconfig.xml file. The default solrconfig.xml, located at /server/solr/configsets/_default/conf, contains documentation of the fields that you can configure. Solr also has a detailed page called "Apache Solr Reference Guide: The Well-Configured Solr Instance". For performance troubleshooting, Apache Solr has a wiki that lays out the factors that affect a Solr server's performance.

¹ Martini uses Solr in custom search indices, the Monitor search index, and the Tracker search index.
² Substitute <host> with the hostname or IP address of your Solr server and <port> with your Solr server's port.
³ See also Tuning the JVM.
