Friday, January 21, 2011

10.9.1. Our Lucene Implementation

Currently we have our own Analyzer and Tokenizer classes (DSAnalyzer and DSTokenizer) to customize our indexing. They invoke the stemming and stop word features within Lucene. We create a new IndexReader for each
query, which we now realize isn't the most efficient use of resources: under very heavy load we seem to run out
of file handles. (A single wildcard query can open many file handles!) Since Lucene is thread-safe, a better future
implementation would be to have a single Lucene IndexReader shared by all queries, which is invalidated and re-opened
when the index changes. Future API growth could include relevance scores (Lucene generates them, but we ignore them)
and abstractions for more advanced search concepts such as boolean queries.
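
The shared-reader idea described above could look roughly like the following. This is only a minimal sketch of the pattern, not our code and not Lucene's API: SharedReaderHolder, its opener argument, and the indexVersion supplier are all hypothetical names. The opener stands in for whatever call opens an IndexReader, and indexVersion stands in for any signal that changes when the index is modified.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical holder that hands the same reader to every query and
 * re-opens it only when the index has changed. R would be Lucene's
 * IndexReader in practice; here it is any Closeable.
 */
final class SharedReaderHolder<R extends Closeable> {
    private final Supplier<R> opener;        // assumption: opens a reader on the current index
    private final LongSupplier indexVersion; // assumption: value changes on every index update
    private R current;                       // the reader shared by all queries
    private long openedAtVersion;            // index version seen when 'current' was opened

    SharedReaderHolder(Supplier<R> opener, LongSupplier indexVersion) {
        this.opener = opener;
        this.indexVersion = indexVersion;
    }

    /** Returns the shared reader, re-opening it first if the index changed. */
    synchronized R get() {
        long version = indexVersion.getAsLong();
        if (current == null || version != openedAtVersion) {
            R stale = current;
            current = opener.get();
            openedAtVersion = version;
            if (stale != null) {
                try {
                    // Simplification: a real implementation must wait until
                    // in-flight searches on the stale reader have finished.
                    stale.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return current;
    }
}
```

With this shape, each query would call get() instead of opening its own reader, so the number of open file handles no longer grows with the number of concurrent queries. The sketch closes the old reader immediately, which is only safe if no search is still using it; reference counting or a deferred close would be needed to handle that case.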
