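The corresponding field in the ES entity class:
/**
 * Vector used for kNN search. The type must be set to dense vector; for the other
 * parameters, see the documentation. I suggest keeping them the same as mine and
 * setting similarity to the similarity function you need (Euclidean or cosine).
 * www.elastic.co/guide/en/elasticsearch/reference/8.12/dense-vector.html
 */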
@Field(type = FieldType.Dense_Vector, dims = 1024, index = true, similarity = "cosine")
private List<Float> vector;
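After the project starts, Spring Data Elasticsearch creates the index from your entity class. You can check how the field is configured in that index's mapping: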
"vector": {
  "type": "dense_vector",
  "dims": 1024,
  "index": true,
  "similarity": "cosine"
}
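To make the choice of similarity more concrete, here is a minimal sketch of how a kNN hit's _score is derived for cosine and for l2_norm (Euclidean), following the formulas in the dense_vector documentation linked above: (1 + cosine) / 2 and 1 / (1 + distance^2). The class and method names are hypothetical and only for illustration; Elasticsearch computes these scores internally.

```java
import java.util.List;

// Hypothetical helper, not part of Elasticsearch or Spring Data Elasticsearch.
public class SimilarityScores {

    // Plain cosine similarity of two equal-length vectors.
    // Undefined for zero-magnitude vectors, which Elasticsearch also rejects for "cosine".
    public static double cosine(List<Float> a, List<Float> b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.size(); i++) {
            double x = a.get(i), y = b.get(i);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // _score documented for similarity = "cosine": (1 + cosine) / 2, in the range [0, 1].
    public static double cosineScore(List<Float> a, List<Float> b) {
        return (1 + cosine(a, b)) / 2;
    }

    // _score documented for similarity = "l2_norm": 1 / (1 + squared Euclidean distance).
    public static double l2Score(List<Float> a, List<Float> b) {
        double squaredDistance = 0;
        for (int i = 0; i < a.size(); i++) {
            double d = a.get(i) - b.get(i);
            squaredDistance += d * d;
        }
        return 1 / (1 + squaredDistance);
    }

    public static void main(String[] args) {
        List<Float> query = List.of(1f, 0f);
        List<Float> doc = List.of(0f, 1f);
        System.out.println("cosine score: " + cosineScore(query, doc));
        System.out.println("l2_norm score: " + l2Score(query, doc));
    }
}
```

With cosine only the direction of the vectors matters, while l2_norm also reacts to their magnitude, so which one fits depends on how your embedding model was trained.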
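kNN search code:
// Imports (Elasticsearch Java client 8.12, Spring Data Elasticsearch 5.2.x)
import co.elastic.clients.elasticsearch._types.KnnQuery;
import co.elastic.clients.elasticsearch._types.query_dsl.*;
import org.springframework.data.elasticsearch.client.elc.NativeQuery;
import org.springframework.data.elasticsearch.core.SearchHits;
import java.util.List;
import java.util.stream.Stream;

// Full-text part: multi_match on "name" and "synonyms".
// name, condition, esTemplate and EmbeddingUtils come from the surrounding project.
BoolQuery.Builder bool = QueryBuilders.bool().boost(1.0f);
Query multiMatch = QueryBuilders.multiMatch()
        .fields("name", "synonyms")
        .query(name)
        .fuzziness(condition.getFuzziness())
        .operator(Operator.And)
        .type(TextQueryType.BestFields)
        .build()
        ._toQuery();
bool.must(multiMatch);
Query query = new Query(bool.build());

// Call the embedding service to generate the query vector
Double[] v = EmbeddingUtils.vector(name);
List<Float> vector = Stream.of(v).map(Double::floatValue).toList();

// Build the kNN query
KnnQuery knnQuery = KnnQuery.of(k -> k.field("vector")
        .boost(0.5f)
        .k(20)
        .numCandidates(6000)
        .queryVector(vector)
);

// Combine the full-text query and the kNN query in one search request.
// scoreSort, timeSort and idSort are sort definitions built elsewhere (not shown).
NativeQuery nativeQuery = NativeQuery.builder()
        .withPageable(condition.toPage())
        .withQuery(query)
        .withSort(scoreSort, timeSort, idSort)
        .withTrackTotalHits(true)
        .withKnnQuery(knnQuery)
        .build();
SearchHits<Product> hits = esTemplate.search(nativeQuery, Product.class);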
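The query vector must have the same number of dimensions as the dims value in the mapping (1024 here), otherwise Elasticsearch rejects the search. A minimal sketch of a conversion helper that checks this before the request is sent; the class and method names are hypothetical, and it assumes the embedding service returns a Double[] as in the code above:

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original project or of any library.
public class EmbeddingVectors {

    // Converts the embedding service output to the List<Float> expected by
    // KnnQuery.queryVector, failing fast if the length does not match the mapping.
    public static List<Float> toFloatList(Double[] embedding, int expectedDims) {
        if (embedding == null || embedding.length != expectedDims) {
            int actual = embedding == null ? 0 : embedding.length;
            throw new IllegalArgumentException(
                    "Expected " + expectedDims + " dimensions but got " + actual);
        }
        // Stream.toList() requires Java 16 or later and returns an unmodifiable list.
        return Stream.of(embedding).map(Double::floatValue).toList();
    }
}
```

Usage would look like List<Float> vector = EmbeddingVectors.toFloatList(EmbeddingUtils.vector(name), 1024); so that a mismatch surfaces as a clear local exception instead of a search failure.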
Notes
Use the latest Spring Data Elasticsearch, version 5.2.3 or later; earlier versions have a bug.