Searching in elvis has always been pretty fast, but with the upcoming 2.1 release it is about to get a whole lot faster.
Designed for speed
One of our design goals when we started development on elvis was to combine handling of millions of assets with high performance. Having first-hand experience with a host of existing DAM systems, we knew we had to take a new road, using modern concepts and technology. Instead of putting metadata in a database to search through, we put it into a Lucene search engine index.
A Lucene index works differently from a typical database. It doesn’t need a very strict data schema, which makes it easy to add metadata fields, a common requirement in DAM systems. Text that’s indexed for searching is analyzed to improve results: the analysis recognizes and normalizes words and plurals, handles stop words like ‘at, the, it’ and deals with special characters for different languages.
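To make the idea concrete, here is a minimal sketch in Python of a schema-free inverted index with a simple analysis step. It is a toy illustration only, not Lucene's actual analyzer or the code used in elvis: the names `STOP_WORDS`, `analyze` and `ToyIndex` are made up for this example, and the plural handling is a deliberately crude assumption.

```python
import re
from collections import defaultdict

# Hypothetical stop word list; real analyzers ship language-specific lists.
STOP_WORDS = {"at", "the", "it", "a", "an", "of", "in", "on"}


def analyze(text):
    """Lowercase, split into word tokens, drop stop words, strip a plural 's'."""
    terms = []
    for token in re.findall(r"\w+", text.lower()):
        if token in STOP_WORDS:
            continue
        # Crude plural handling (an assumption, far simpler than real stemming).
        if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
            token = token[:-1]
        terms.append(token)
    return terms


class ToyIndex:
    """Inverted index keyed by (field, term); no schema is declared up front."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of asset ids

    def add(self, asset_id, fields):
        # Any field name is accepted, so new metadata fields need no migration.
        for field, value in fields.items():
            for term in analyze(str(value)):
                self.postings[(field, term)].add(asset_id)

    def search(self, field, query):
        """Return ids of assets whose field contains every analyzed query term."""
        terms = analyze(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get((field, term), set())
            result = set(ids) if result is None else result & ids
        return result


index = ToyIndex()
index.add(1, {"caption": "The boats at the harbour"})
index.add(2, {"caption": "A boat race", "photographer": "Jansen"})
matches = index.search("caption", "boats")
```

Because the query goes through the same analysis as the indexed text, searching for ‘boats’ finds both the asset captioned with ‘boats’ and the one captioned with ‘boat’. The second asset also carries a `photographer` field that the first one lacks, without any schema change.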
The most important difference is that a Lucene index scales much better than a database. Databases are designed to handle relational and highly structured data, but this complexity has its downside. When the amount of data grows to millions and millions of records, you need constant maintenance by skilled people to keep up performance. In Lucene, data is optimized automatically, giving it high performance without any maintenance at all.
Taking it to the next level
There were some challenges with Lucene, though. Lucene is typically used for high-volume searching with infrequent updates to the index. In an application like elvis, the index is continuously added to and changed, and people who are searching need to see these changes immediately. This worked fine on indexes with a few hundred thousand entries, but on big indexes, complex searches became slow when combined with sorting on specific fields.