This is the official API example; I have simply translated the comments.
To use Lucene, an application should:
1. Create Documents by adding Fields;
2. Create an IndexWriter and add documents to it with addDocument();
3. Call QueryParser.parse() to build a query from a string; and
4. Create an IndexSearcher and pass the query to its search() method.
Example 1
Note: the example uses assertions internally, so run it under JUnit to verify the results.
package com.test.caoxs.l5Filter;
import static org.junit.Assert.assertEquals;
import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.junit.Test;
public class L5SimpleTester {

    @Test
    public void L5SimpleTester() throws IOException, ParseException {
        Analyzer analyzer = new StandardAnalyzer();

        // Store the index in memory:
        Directory directory = new RAMDirectory();
        // To store an index on disk, use this instead
        // (in Lucene 5, FSDirectory.open() takes a java.nio.file.Path):
        // Directory directory = FSDirectory.open(Paths.get("/tmp/testindex"));
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter iwriter = new IndexWriter(directory, config);
        Document doc = new Document();
        String text = "This is the text to be indexed.";
        doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
        iwriter.addDocument(doc);
        iwriter.close();

        // Now search the index:
        DirectoryReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        // Parse a simple query that searches for "text":
        QueryParser parser = new QueryParser("fieldname", analyzer);
        Query query = parser.parse("text");
        ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
        // These assertions are why the test must be run under JUnit
        assertEquals(1, hits.length);
        // Iterate through the results:
        for (int i = 0; i < hits.length; i++) {
            Document hitDoc = isearcher.doc(hits[i].doc);
            assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
        }
        ireader.close();
        directory.close();
    }
}
Example 2
If you want to apply a different analyzer to each field, use PerFieldAnalyzerWrapper, as in the official example below. After configuring analyzerPerField, aWrapper replaces the analyzer from Example 1 (the variable analyzer).
API Example usage:
Map<String, Analyzer> analyzerPerField = new HashMap<>();
// Use the CJK analyzer for firstname
analyzerPerField.put("firstname", new CJKAnalyzer());
// Use the whitespace analyzer for lastname
analyzerPerField.put("lastname", new WhitespaceAnalyzer());
// In Lucene 5, the Version argument is no longer required
PerFieldAnalyzerWrapper aWrapper =
        new PerFieldAnalyzerWrapper(new StandardAnalyzer(), analyzerPerField);
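To make the snippet above concrete, here is a minimal sketch that wires aWrapper into the indexing flow from Example 1. The class name PerFieldExample and the sample field values are my own; it assumes Lucene 5.x jars (lucene-core, lucene-analyzers-common, lucene-queryparser) on the classpath. It demonstrates the per-field behavior: WhitespaceAnalyzer does not lowercase, so a lastname query only matches with the exact case that was indexed.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class PerFieldExample {

    // Index one document and return the hit counts for the queries
    // "Smith" and "smith" against the lastname field.
    public static int[] run() throws Exception {
        Map<String, Analyzer> analyzerPerField = new HashMap<>();
        analyzerPerField.put("firstname", new CJKAnalyzer());       // CJK bigram tokens
        analyzerPerField.put("lastname", new WhitespaceAnalyzer()); // splits on whitespace, keeps case
        Analyzer aWrapper =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer(), analyzerPerField);

        // Same indexing flow as Example 1, but with aWrapper as the analyzer
        Directory directory = new RAMDirectory();
        IndexWriter iwriter = new IndexWriter(directory, new IndexWriterConfig(aWrapper));
        Document doc = new Document();
        doc.add(new Field("firstname", "张三", TextField.TYPE_STORED));
        doc.add(new Field("lastname", "Smith", TextField.TYPE_STORED));
        iwriter.addDocument(doc);
        iwriter.close();

        DirectoryReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        QueryParser parser = new QueryParser("lastname", aWrapper);

        // WhitespaceAnalyzer does not lowercase, so only the exact-case term matches
        int exactCase = isearcher.search(parser.parse("Smith"), 1000).scoreDocs.length;
        int lowerCase = isearcher.search(parser.parse("smith"), 1000).scoreDocs.length;

        ireader.close();
        directory.close();
        return new int[] { exactCase, lowerCase };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = run();
        System.out.println("Smith -> " + counts[0] + " hit(s), smith -> " + counts[1] + " hit(s)");
    }
}
```

Had "lastname" used the default StandardAnalyzer, both queries would match, because StandardAnalyzer lowercases terms at index and query time.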