Indexing a document to a custom search index
In this guide, we will show you how to add a new document to your custom search index. We will create a script that indexes movie data and discuss the objects and methods in that script that make indexing possible. For simplicity's sake, the data we index will be entered manually via service inputs.
Stuff you need to know...
Get the code!
The scripts mentioned in this guide are available in the
As a bonus, you can find other services in the
examples package that demonstrate
the use of functions from the
as well as other Solr-related functionality.
Here's the outline of our set-up:
- Our package is called examples. This is where our scripts will reside.
- Our target Solr core is embedded and named movie-core. As a result, the directory structure of the examples package looks like this:
examples
├── classes
├── code
├── conf
├── web
└── solr
    └── movie-core
        ├── core.properties
        └── conf
            ├── schema.xml
            └── solrconfig.xml
- The package.xml file has already been edited to make the embedded Solr core known:
<package>
    <!-- ... -->
    <solr-cores>
        <solr-core name="movie-core" enabled="true" />
        <!-- ... -->
    </solr-cores>
</package>
Creating the model
We need to create a model that can hold the data we want to index. In this case, we need to create a model for holding movie data.
You can manually create your Gloop model from scratch, or
you can extract the fields defined in the
schema.xml file to create a model based on it. In our case, we will do
the latter using the
We have placed this script in
code directory, under
solr.customSolrCore.model. You should
be able to use this script to parse your own
schema.xml file. Depending on your setup, you may need to tweak
it a little more. Here's a breakdown of the Gloop steps it contains:
In Line 2, we have another map step that declares and initializes a Path variable that points to schema.xml's location. We'll use martiniPackage#getHome() as the base path and from there, we can traverse to schema.xml's actual location, like so:
Paths.get(esbPackage.getHome(), 'solr', 'movie-core', 'conf', 'schema.xml')
In Line 3, we have added a third map step, but this time we use it to declare and initialize a string variable that holds schema.xml's content. We did that this way:
You may have noticed that the last line of code read in a byte array, but the variable was a string. This is possible thanks to the Gloop
In Line 4, we create an invoke step that calls
SolrMethods.solrSchemaToGloopModel(String, String, String, String, List<GloopModel>). This method will create the Gloop model
solr.customSolrCore.model, based on the
SolrMethods.solrSchemaToGloopModel("MovieDocument", schemaContent, null, "solr.customSolrCore.model", null)
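The breakdown above can be sketched in plain Java using only JDK classes. This is an illustration under stated assumptions, not the script itself: the class and method names (SchemaSketch, readSchema, fieldNames) are hypothetical, and the DOM-based field listing is only a stand-in that shows the kind of information a schema-to-model conversion works from. It is not how SolrMethods.solrSchemaToGloopModel is implemented, and it assumes field definitions appear as <field name="..."> elements.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class; none of these names come from Martini.
public class SchemaSketch {

    // Mirrors Lines 2 and 3: build the path from the package home,
    // read the file as bytes, and decode the bytes into a string.
    public static String readSchema(String packageHome) {
        Path schemaPath = Paths.get(packageHome, "solr", "movie-core", "conf", "schema.xml");
        try {
            return new String(Files.readAllBytes(schemaPath), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Illustrative stand-in for Line 4: collect the name attribute
    // of every <field> element found in the schema content.
    public static List<String> fieldNames(String schemaContent) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(schemaContent.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagName("field");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                names.add(((Element) fields.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema content", e);
        }
    }
}
```

Calling SchemaSketch.fieldNames(SchemaSketch.readSchema(packageHome)) would list the field names that a generated model could be expected to mirror. Elements such as copyField or fieldType are skipped because they use different tag names.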
All you have to do now is run the service and voila! You now have your schema.xml-based Gloop model! If you're following along with our example, this will produce the MovieDocument model in solr.customSolrCore.model. We'll use this model later.
The MovieDocument model would have the following fields:
In this case, the Groovy bean class MovieDocument.groovy will hold the movie data we want to index. We'll place it in the solr.customSolrCore.model package. Its content will be:
The @Field annotations indicate which fields we want to index.
Fields defined in the schema
If you take a look at the schema.xml file, you will notice that its documents are defined with six fields:
- id is the identifier for our documents; its value is automatically generated by Solr due to the
- _version_ is another field whose value is automatically supplied by Solr. It is an internal field used by the partial update procedure, the update log process, and SolrCloud, and it is required to perform optimistic concurrency
- text is a compilation of copied fields and is used as the default search field when clients run their queries
The other fields are provided by the client.
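To make the role of _version_ in optimistic concurrency more concrete, here is a minimal in-memory sketch in Java. It is a conceptual model only, not Solr's implementation or API: the VersionedStore class and its methods are hypothetical. The idea it illustrates is that an update carrying a stale version number is rejected instead of silently overwriting a newer document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store used only to illustrate version checks.
public class VersionedStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();
    private long nextVersion = 1;

    // Unconditional write: stores the value and assigns a fresh version.
    public long put(String id, String value) {
        long version = nextVersion++;
        values.put(id, value);
        versions.put(id, version);
        return version;
    }

    // Conditional write: succeeds only when the caller's version is current.
    public long update(String id, String value, long expectedVersion) {
        Long current = versions.get(id);
        if (current == null || current.longValue() != expectedVersion) {
            throw new IllegalStateException("Version conflict for document " + id);
        }
        return put(id, value);
    }

    public String get(String id) {
        return values.get(id);
    }
}
```

A client that read a document at one version and tries to update it after someone else has changed it would hit the conflict branch and would need to re-read the document before retrying.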
Indexing the model's data
Since our model is ready, we can now create a service that gets and indexes the model's data. We'll populate our models manually to make things simpler.
Insert in bulk
You can use the
to insert documents in bulk.
The MovieIndexer service will be responsible for indexing our MovieDocument's data. Here's a preview of the steps we will have in this service:
MovieIndexer's sole input parameter is called
movieDocument, based on the
MovieDocument Gloop model we
created earlier. Because of this, we will be prompted to enter four fields when we
run the service:
Martini will build the movieDocument parameter from our inputs and from there, we can index it using SolrMethods.index(String, String, GloopModel).
The bullet points below explain each step in the service:
- In Line 1, we have a try-catch block step. This lets Gloop mirror Java's try-catch: the code that could throw an exception is wrapped in a try block, and a "rescue" is performed in the catch block.
- In Line 3, under the try block, we have an invoke step that calls SolrMethods.index(String, String, GloopModel). This is where the actual indexing happens. It indexes movieDocument so that it will be available for querying in the movie-core Solr core later.
- In Line 5, we have another invoke step, this time under the catch block, that calls LoggerMethods.error(String). This simply logs the exception if anything goes wrong while indexing.
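The control flow described in the bullets above maps closely onto a Java try-catch. The sketch below is only an illustration of that flow: the Indexer interface is a hypothetical stand-in for the indexing call (it is not the SolrMethods.index signature), the error list stands in for the logger, and the title field used in the usage note is an assumed field name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the MovieIndexer flow; not Martini code.
public class MovieIndexerSketch {

    // Stand-in for the indexing call made inside the try block.
    public interface Indexer {
        String index(String core, Map<String, ?> document) throws Exception;
    }

    private final Indexer indexer;
    private final List<String> errorLog = new ArrayList<>();

    public MovieIndexerSketch(Indexer indexer) {
        this.indexer = indexer;
    }

    // Returns the indexer's response, or null if indexing failed.
    public String run(Map<String, ?> movieDocument) {
        try {
            // Try block: the step that can fail.
            return indexer.index("movie-core", movieDocument);
        } catch (Exception e) {
            // Catch block: record the failure instead of propagating it.
            errorLog.add("Failed to index document: " + e.getMessage());
            return null;
        }
    }

    public List<String> errors() {
        return errorLog;
    }
}
```

For example, new MovieIndexerSketch((core, doc) -> core + ":" + doc.get("title")).run(Map.of("title", "Alien")) returns the indexer's response, while an indexer that throws leaves a message in errors() and the caller receives null.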
Running the MovieIndexer service will prompt you to populate its movieDocument input. You can enter whatever values you want to index. If the service is invoked successfully, it should return a response similar to the one below:
This time, we'll create an endpoint whose parameters are to be mapped to the
MovieDocument bean's fields. We can
just call this Spring-based endpoint and the indexing will take place.
Simply create a Groovy file named
solr.customSolr and edit it so that it contains the code
As you may notice, in:
- Line 25, we constructed a MovieDocument (the document variable) from the parameters of our request.
- Line 26, we used the SolrMethods.index(String, String, GloopModel) method, a function, to index the data for us. We subsequently called the GloopMethod#toString() method so that our endpoint's response is the indexed
With that said, a call to the endpoint will trigger the indexing of your movie data. For example:
Try out the service via the service invoker
You can click on the run button shown at the beginning of the signature of a method to run the method.