Research in information retrieval (IR) is characterized by a highly empirical
approach to practical problems. Accordingly, a number of toolkits have
been developed to support experimentation in IR research. In this paper, we
describe LabLucene, our research-oriented, comprehensive, and scalable IR system,
which is adapted from Apache Lucene to support IR research
in a laboratory environment.
LabLucene has the following distinguishing features. First, thanks to its modular architecture, it currently provides
the most comprehensive implementation of state-of-the-art retrieval models, including classical probabilistic
models and language models. Second, it provides a multi-level cache
dedicated to TREC-like experiments to improve retrieval efficiency.
Third, it provides advanced logging during indexing and searching for analysis and debugging.
LabLucene is developed collaboratively by Dalian University of Technology in China and York University in Canada, and
can be downloaded from http://www.zye.me/soft/LabLucene.zip
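
To make the two model families named above concrete, the following is a minimal sketch of one representative from each: Okapi BM25 for the classical probabilistic models and Dirichlet-smoothed query likelihood for the language models. It uses the textbook formulas only. The class name ScoringSketch, the method names, and the parameter values are assumptions made for illustration and are not taken from LabLucene's code.

```java
public final class ScoringSketch {
    private ScoringSketch() {}

    /**
     * Okapi BM25 contribution of a single query term to a document score.
     * tf: term frequency in the document; docLen/avgDocLen: document length
     * and average document length; docFreq: number of documents containing
     * the term; numDocs: collection size; k1, b: the usual free parameters.
     */
    public static double bm25(double tf, double docLen, double avgDocLen,
                              long docFreq, long numDocs, double k1, double b) {
        // Smoothed IDF; the 1.0 inside the log keeps the value non-negative.
        double idf = Math.log(1.0 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
        // Length normalisation: longer-than-average documents are penalised.
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return idf * tf * (k1 + 1.0) / (tf + norm);
    }

    /**
     * Log-probability of a single query term under a document language model
     * with Dirichlet smoothing. collectionProb is the term's relative
     * frequency in the whole collection; mu is the smoothing parameter.
     */
    public static double dirichletLm(double tf, double docLen,
                                     double collectionProb, double mu) {
        return Math.log((tf + mu * collectionProb) / (docLen + mu));
    }

    public static void main(String[] args) {
        // Hypothetical statistics for one term in one document.
        double tf = 3, docLen = 120, avgDocLen = 100;
        long docFreq = 50, numDocs = 10_000;
        System.out.println("BM25:         " + bm25(tf, docLen, avgDocLen, docFreq, numDocs, 1.2, 0.75));
        System.out.println("Dirichlet LM: " + dirichletLm(tf, docLen, 0.001, 2000));
    }
}
```

In both cases a document's score for a query is the sum of the per-term values over the query terms, so in a modular design the two models can differ only in the per-term scoring function.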