Boosting Document Fields in Lucene

The sample document data:

private String[] ids = {"1", "2", "3", "4"};
private String[] authors = {"Jack", "Marry", "John", "Json"};
private String[] positions = {"accounting", "technician", "salesperson", "boss"};
private String[] titles = {"Java is a good language.", "Java is a cross platform language", "Java powerful", "You should learn java"};
private String[] contents = {
    "If possible, use the same JRE major version at both index and search time.",
    "When upgrading to a different JRE major version, consider re-indexing. ",
    "Different JRE major versions may implement different versions of Unicode,",
    "For example: with Java 1.4, `LetterTokenizer` will split around the character U+02C6,"
};

Step 1: Obtain an IndexWriter instance

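private Directory dir;

/**
 * Obtain an IndexWriter instance.
 */
public IndexWriter getWriter() throws Exception {
    dir = FSDirectory.open(Paths.get("D:\\lucene2"));
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    // Rebuild the index from scratch on every run. The default mode
    // (CREATE_OR_APPEND) would add duplicate documents when re-indexing.
    config.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
    return new IndexWriter(dir, config);
}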

Step 2: Build the index

/**
 * Build the index. Use TextField for fields that need to be tokenized.
 * @throws Exception
 */
@Test
public void index() throws Exception {
    // try-with-resources closes the writer, which also commits the changes
    try (IndexWriter writer = getWriter()) {
        for (int i = 0; i < ids.length; i++) {
            Document doc = new Document();
            // StringField: indexed as a single token, not analyzed
            doc.add(new StringField("id", ids[i], Field.Store.YES));
            doc.add(new StringField("author", authors[i], Field.Store.YES));
            doc.add(new StringField("position", positions[i], Field.Store.YES));
            // TextField: analyzed (tokenized) before indexing
            doc.add(new TextField("title", titles[i], Field.Store.YES));
            doc.add(new TextField("content", contents[i], Field.Store.NO));
            writer.addDocument(doc);
        }
    }
}

Step 3: Search

/**
 * Search the index.
 * @throws Exception
 */
@Test
public void testSearch() throws Exception {
    // Both the directory and the reader are closed automatically
    try (Directory directory = FSDirectory.open(Paths.get("D:\\lucene2"));
         IndexReader reader = DirectoryReader.open(directory)) {
        IndexSearcher is = new IndexSearcher(reader);
        String searchField = "title";
        // TermQuery is not analyzed, so the term must be lowercase
        // to match the tokens that StandardAnalyzer indexed.
        String str = "java";
        Term term = new Term(searchField, str);
        Query query = new TermQuery(term);
        TopDocs hits = is.search(query, 10);
        // Hits are returned in descending score order
        for (ScoreDoc scoreDoc : hits.scoreDocs) {
            Document document = is.doc(scoreDoc.doc);
            System.out.println(document.get("author"));
        }
    }
}

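Without any boosting, the default result order is:

Now suppose the document whose position is boss should rank first. To achieve that, change the indexing method: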

/**
 * Rebuild the index with a boost on the title field to change the result order.
 * @throws Exception
 */
@Test
public void reIndex() throws Exception {
    // try-with-resources closes the writer, which also commits the changes
    try (IndexWriter writer = getWriter()) {
        for (int i = 0; i < ids.length; i++) {
            Document doc = new Document();
            doc.add(new StringField("id", ids[i], Field.Store.YES));
            doc.add(new StringField("author", authors[i], Field.Store.YES));
            doc.add(new StringField("position", positions[i], Field.Store.YES));
            TextField field = new TextField("title", titles[i], Field.Store.YES);
            // Raise the weight of the title field only for the boss's document
            if ("boss".equals(positions[i])) {
                field.setBoost(1.5f);
            }
            doc.add(field);
            doc.add(new TextField("content", contents[i], Field.Store.NO));
            writer.addDocument(doc);
        }
    }
}

After re-indexing and searching again, the result becomes:

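Changing a field's boost changes the order of the search results. In field.setBoost(1.5f), the default boost is 1; values above 2 appear to have no additional effect. The behavior also depends on the Lucene version.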
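To see why a boost as small as 1.5 can be enough here, the following is a rough sketch in plain Java, not Lucene code. It assumes the TF-IDF similarity that Lucene 5.x uses by default, where the index-time boost is multiplied into the field's length norm (roughly boost / sqrt(number of terms)). The class name, the method names, and the per-title token counts (after stop-word removal) are assumptions made for illustration. The sketch also ignores that Lucene stores norms in a single lossy byte, which can make small boost differences disappear.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BoostNormSketch {

    /** Simplified length norm: boost / sqrt(number of indexed terms). */
    public static double norm(float boost, int numTerms) {
        return boost / Math.sqrt(numTerms);
    }

    /** Returns the authors ordered by descending norm (ties keep input order). */
    public static List<String> rank(String[] authors, int[] termCounts, float[] boosts) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < authors.length; i++) {
            order.add(i);
        }
        // Sort by negated norm so that the largest norm comes first
        order.sort(Comparator.comparingDouble((Integer i) -> -norm(boosts[i], termCounts[i])));
        List<String> result = new ArrayList<>();
        for (int i : order) {
            result.add(authors[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] authors = {"Jack", "Marry", "John", "Json"};
        // Assumed token counts of the four titles after stop words are dropped
        int[] termCounts = {3, 4, 2, 4};

        // No boost: the shortest title has the largest norm
        System.out.println(rank(authors, termCounts, new float[]{1f, 1f, 1f, 1f}));
        // Boost 1.5 on the boss's title: 1.5 / sqrt(4) = 0.75 > 1 / sqrt(2)
        System.out.println(rank(authors, termCounts, new float[]{1f, 1f, 1f, 1.5f}));
    }
}
```

Every title contains "java" exactly once, so under these assumptions the term frequency and inverse document frequency are the same for all four documents, and the norm alone decides the order. The order printed by the sketch is only an approximation of what Lucene itself returns.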

Version 7.2.1 does not have this method (index-time boosts were removed in Lucene 7.0), so I am using version 5.3.1 for now.
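On versions without setBoost, one possible substitute is to boost at query time instead. The sketch below is an assumption about how that could look, not something tested in this post. It relies on BoostQuery and BooleanQuery.Builder from the Lucene library (which must be on the classpath). The class name BossBoostQuery and the helper build are hypothetical names introduced for illustration.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.BoostQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class BossBoostQuery {

    /**
     * Hypothetical helper: the title must contain the given term, and
     * documents whose position is "boss" receive an additional boosted score.
     */
    public static BooleanQuery build(String titleTerm, float bossBoost) {
        // Required clause: same TermQuery as in the search step
        Query titleQuery = new TermQuery(new Term("title", titleTerm));
        // Optional clause: matches only the boss's document and adds to its score
        Query bossQuery = new BoostQuery(
                new TermQuery(new Term("position", "boss")), bossBoost);
        return new BooleanQuery.Builder()
                .add(titleQuery, BooleanClause.Occur.MUST)
                .add(bossQuery, BooleanClause.Occur.SHOULD)
                .build();
    }
}
```

In testSearch, the query could then be created with BossBoostQuery.build("java", 2.0f) in place of the plain TermQuery. The index does not need to be rebuilt, and the boost value can be tuned per search. Whether the boss's document actually ends up first depends on the scores of the other documents.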
