As dynamic sites and installations grow more complex and users demand more robust search, many system administrators and developers are looking to build more capable search into their websites and systems. Two software solutions have risen to the top of the pack in this regard: Apache Solr and elasticsearch. Both are good systems in their own right, but which is better for you and your use case? Let’s run through their individual capabilities and find out.
Right off the bat, Apache Solr is quite a bit easier to deploy than elasticsearch, thanks to the third-party product integrations it supports: Drupal, Magento, Django, ColdFusion, WordPress, OpenCMS, Plone, Typo3, ez Publish, and Riak (via Yokozuna). Compare that to elasticsearch’s out-of-the-box support for only Django and Couchbase, and you can see that Solr can be immediately deployed and integrated into a wider range of systems than elasticsearch can. Solr also allows a bit more flexibility in terms of data format: you can use XML, CSV, and JSON, while elasticsearch only accepts JSON.
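To make the format difference concrete, here is a minimal sketch of the conversion step you would need if your data lives in CSV and your search engine only accepts JSON. It uses only the Python standard library and does not talk to either product; the function name `csv_to_json_docs` and the sample fields `id` and `title` are made up for illustration.

```python
import csv
import io
import json


def csv_to_json_docs(csv_text):
    """Turn CSV text (with a header row) into a list of JSON document strings.

    Hypothetical helper for illustration only: each CSV row becomes one
    JSON object whose keys are the column names from the header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # sort_keys keeps the output stable regardless of column order
    return [json.dumps(row, sort_keys=True) for row in reader]


# Made-up sample data: two records with an id and a title
sample_csv = "id,title\n1,Getting started\n2,Scaling search\n"

docs = csv_to_json_docs(sample_csv)
for doc in docs:
    print(doc)
```

Note that every value comes out as a string, since CSV carries no type information; a real import would most likely need to convert numbers and dates before indexing.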
This immediate flexibility in Solr’s initial deployment makes it very attractive for companies that need a full-text search solution right away, one that can handle a variety of formats right out of the gate. These advantages, however, shrink considerably when a larger deployment is planned or when there is time allotted for a proper integration of a full-text search product: some of Solr’s inherent design choices make it a bit more limited when it comes to scaling up to larger enterprise deployments.
Solr’s lack of support for complex documents and for multiple document types per schema, for example, will start to pose problems for larger enterprises as they scale up. The same goes for its distribution handling and schema handling: both are inflexible, as they are set at initial index creation and cannot be altered afterwards. elasticsearch allows both online schema changes and moving shards and changing the number of shards per deployment, which lets it adapt to the growing complexity of an enterprise’s search needs.
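To see why the shard layout is such a sensitive setting, consider a deliberately simplified sketch of hash-based routing, where each document is assigned to a shard by hashing its ID. This is an assumption made for illustration, not the actual routing code of Solr or elasticsearch, and the names `shard_for` and `moved_fraction` are hypothetical.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Pick a shard by hashing the document ID (simplified illustration)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def moved_fraction(doc_ids, old_shards, new_shards):
    """Fraction of documents that land on a different shard after resizing."""
    moved = sum(
        1
        for doc_id in doc_ids
        if shard_for(doc_id, old_shards) != shard_for(doc_id, new_shards)
    )
    return moved / len(doc_ids)


# Made-up document IDs for the experiment
doc_ids = [f"doc-{i}" for i in range(1000)]

# Going from 4 shards to 5 reassigns most documents under this scheme
print(moved_fraction(doc_ids, 4, 5))
```

Under this naive scheme, changing the shard count sends most documents to a different shard, which is why a system that fixes its layout at index creation is hard to grow, and why being able to rebalance a live deployment matters so much at scale.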
So what does this mean for you and your organization? The short version is this: Apache Solr is probably a bit easier to implement and deploy than elasticsearch for a small or medium-sized organization. In fact, for an organization of that size, either Solr or elasticsearch will most likely be enough to handle any full-text indexing you need.
If you are anticipating significant growth in capacity or need, however, especially a scenario that requires distributed indexing, you would most likely be better off with elasticsearch. While it is a little more difficult to deploy initially, its more powerful and flexible distributed capabilities make it easier to scale and retool a deployment as your organization grows in size and complexity. As always, take a look at each and decide what you need: a little bit of research goes a long way towards being happy with a deployment!
Apache Solr: http://lucene.apache.org/solr/