Use a BooleanQuery or write more values to the index?

2022-01-15 00:00:00 lucene java lucene.net

Given a category tree like this:

root_1
  sub_1
  sub_2
  ... to sub_20 

Every document has one subcategory (such as sub_2). Currently, I write only sub_2 to the Lucene index:

new NumericField("category",...).setIntValue(sub_2.getID());

I want to get all of root_1's documents. Should I search with a BooleanQuery that combines sub_1 through sub_20, or write an additional ancestor category value into every document, like this:

new NumericField("category",...).setIntValue(sub_2.getID());
new NumericField("category",...).setIntValue(root_1.getID());//sub_2's ancestor category

Which is the better choice?
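
The two options can be compared on a toy tree without touching Lucene. The sketch below is illustrative only: the class `CategoryTreeSketch`, its methods, and the numeric IDs are hypothetical and do not come from the question. `idsToQuery` lists the category IDs that a BooleanQuery would need one clause for, and `idsToIndex` lists the values that would be written to the `category` field of each document under the second option.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory category tree: child ID -> parent ID (roots have no entry).
class CategoryTreeSketch {
    private final Map<Integer, Integer> parentOf = new HashMap<>();

    void addChild(int parentId, int childId) {
        parentOf.put(childId, parentId);
    }

    // Option 1 (query time): the root ID plus every ID below it.
    // Each returned ID would become one clause of the combined query.
    List<Integer> idsToQuery(int rootId) {
        List<Integer> ids = new ArrayList<>();
        ids.add(rootId);
        for (int childId : parentOf.keySet()) {
            if (isDescendant(childId, rootId)) {
                ids.add(childId);
            }
        }
        return ids;
    }

    // Option 2 (index time): the document's own category, then each ancestor up to the root.
    List<Integer> idsToIndex(int categoryId) {
        List<Integer> ids = new ArrayList<>();
        Integer current = categoryId;
        while (current != null) {
            ids.add(current);
            current = parentOf.get(current);
        }
        return ids;
    }

    private boolean isDescendant(int id, int rootId) {
        Integer current = parentOf.get(id);
        while (current != null) {
            if (current == rootId) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        CategoryTreeSketch tree = new CategoryTreeSketch();
        int root1 = 100; // assumed ID for root_1
        for (int i = 1; i <= 20; i++) {
            tree.addChild(root1, root1 + i); // assumed IDs for sub_1 .. sub_20
        }
        System.out.println("Clauses needed at query time: " + tree.idsToQuery(root1).size());
        System.out.println("Values indexed for sub_2: " + tree.idsToIndex(root1 + 2));
    }
}
```

Under these assumptions, the first option needs one clause per subcategory on every root lookup, while the second adds one extra field value per ancestor to every document and lets a root lookup match a single value.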

Recommended answer

I would use a path enumeration ('Dewey Decimal') representation of the category hierarchy. That is, instead of storing just 'sub_2' for the second child of the first root, store something like '001.002'.
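
A minimal sketch of how such path strings might be built, assuming each level is written as a zero-padded, three-digit sibling position joined by dots. The class `DeweyPathSketch`, the method `encode`, and the three-digit width are assumptions for illustration; the answer only gives '001.002' as an example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class DeweyPathSketch {
    // Encodes sibling positions from the root down, e.g. (1, 2) -> "001.002".
    // The fixed width keeps every segment the same length, so "001" can never
    // be confused with a longer segment such as "0010".
    static String encode(int... positions) {
        List<String> parts = new ArrayList<>();
        for (int position : positions) {
            if (position < 0 || position > 999) {
                throw new IllegalArgumentException("position out of range: " + position);
            }
            parts.add(String.format(Locale.ROOT, "%03d", position));
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        System.out.println(encode(1));    // prints 001      (the first root)
        System.out.println(encode(1, 2)); // prints 001.002  (its second child)
    }
}
```

The resulting string would be stored in the `category` field in place of the numeric ID, presumably as a single untokenized term so that the whole path can be matched by its prefix.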

To find the root and all of its children, you would search on "category:001*".

To find only the children of the root, you would search on "category:001.*".
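
The difference between the two searches comes down to plain prefix matching on the stored path. The sketch below imitates that with `String.startsWith` instead of calling Lucene, so it shows the matching rule only, not the query API; the class `PrefixMatchSketch`, the method `matchPrefix`, and the sample terms are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrefixMatchSketch {
    // Returns the terms that a trailing-wildcard search "prefix*" would match.
    static List<String> matchPrefix(List<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed category paths as they might be stored, one per document.
        List<String> terms = Arrays.asList("001", "001.001", "001.002", "002", "002.001");

        // "001*" keeps the root itself and everything under it.
        System.out.println(matchPrefix(terms, "001"));  // prints [001, 001.001, 001.002]

        // "001.*" requires the dot, so the root itself is excluded.
        System.out.println(matchPrefix(terms, "001.")); // prints [001.001, 001.002]
    }
}
```

Note that in a tree deeper than two levels, the "001." prefix would match every descendant of the root (for example "001.002.005"), not just its direct children. This also assumes the `category` field is indexed as one untokenized term.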

(See also: How to store tree data in a Lucene/Solr/Elasticsearch index or NoSQL database?)
