欢迎来到尧图网

客户服务 关于我们

您的位置:首页 > 文旅 > 艺术 > [ElasticSearch]Suggest查询建议(自动补全纠错)

[ElasticSearch]Suggest查询建议(自动补全纠错)

2025/4/21 2:03:12 来源:https://blog.csdn.net/qq_41860765/article/details/147272594  浏览:    关键词:[ElasticSearch]Suggest查询建议(自动补全纠错)

概述

搜索一般都会要求具有“搜索推荐”或者叫“搜索补全”的功能,即在用户输入搜索的过程中,进行自动补全或者纠错。以此来提高搜索文档的匹配精准度,进而提升用户的搜索体验,这就是Suggest。

四种Suggester

1 Term Suggester

Term Suggester: term suggester正如其名,只基于tokenizer之后的单个term去匹配建议词,并不会考虑多个term之间的关系,对给入的文本进行分词,为每个词进行模糊查询提供词项建议。(建议对搜索词进行长度控制,超过长度则不会进行TermSuggest,原因也是一般Term Suggester适用于单个词使用 把得分最高的推荐词进行返回代表纠错)

POST <index>/_search
{ "suggest": {"<suggest_name>": {"text": "<search_content>","term": {"suggest_mode": "<suggest_mode>","field": "<field_name>"}}}
}

在这里插入图片描述

2 Phrase Suggester

Phrase Suggester:phrase suggester和term suggester相比,对建议的文本会参考上下文,也就是一个句子的其他token,不只是单纯的token距离匹配,它可以基于共生和频率选出更好的建议。在term的基础上,会考量多个term之间的关系,比如是否同时出现在索引的原文里,相邻程度,以及词频等。

如果说term suggester建议处理单个词的纠错 那么Phrase Suggester就建议作为一整句话的纠错(返回值的suggest列表中返回的也是一整句话)
在这里插入图片描述

DELETE test
POST test/_bulk
{ "index" : { "_id":1} }
{"title": "lucene and elasticsearch"}
{ "index" : {"_id":2} }
{"title": "lucene and elasticsearhc"}
{ "index" : { "_id":3} }
{"title": "luceen and elasticsearch"}POST test/_search
GET test/_mapping
POST test/_search
{"suggest": {"text": "Luceen and elasticsearhc","simple_phrase": {"phrase": {"field": "title.trigram","max_errors": 2,"gram_size": 1,"confidence":0,"direct_generator": [{"field": "title.trigram","suggest_mode": "always"}],"highlight": {"pre_tag": "<em>","post_tag": "</em>"}}}}
}

3 completion suggester

自动补全,自动完成,支持三种查询【前缀查询(prefix)模糊查询(fuzzy)正则表达式查询(regex)】 ,主要针对的应用场景就是"Auto Completion"。 此场景下用户每输入一个字符的时候,就需要即时发送一次查询请求到后端查找匹配项,在用户输入速度较高的情况下对后端响应速度要求比较苛刻。因此实现上它和前面两个Suggester采用了不同的数据结构,索引并非通过倒排来完成,而是将analyze过的数据编码成FST和索引一起存放。对于一个open状态的索引,FST会被ES整个装载到内存里的,进行前缀查找速度极快。但是FST只能用于前缀查找,这也是Completion Suggester的局限所在。
在这里插入图片描述

DELETE suggest_carinfo
PUT suggest_carinfo
{"mappings": {"properties": {"title": {"type": "text","analyzer": "ik_max_word","fields": {"suggest": {"type": "completion","analyzer": "ik_max_word"}}},"content": {"type": "text","analyzer": "ik_max_word"}}}
}POST _bulk
{"index":{"_index":"suggest_carinfo","_id":1}}
{"title":"宝马X5 两万公里准新车","content":"这里是宝马X5图文描述"}
{"index":{"_index":"suggest_carinfo","_id":2}}
{"title":"宝马5系","content":"这里是奥迪A6图文描述"}
{"index":{"_index":"suggest_carinfo","_id":3}}
{"title":"宝马3系","content":"这里是奔驰图文描述"}
{"index":{"_index":"suggest_carinfo","_id":4}}
{"title":"奥迪Q5 两万公里准新车","content":"这里是宝马X5图文描述"}
{"index":{"_index":"suggest_carinfo","_id":5}}
{"title":"奥迪A6 无敌车况","content":"这里是奥迪A6图文描述"}
{"index":{"_index":"suggest_carinfo","_id":6}}
{"title":"奥迪双钻","content":"这里是奔驰图文描述"}
{"index":{"_index":"suggest_carinfo","_id":7}}
{"title":"奔驰AMG 两万公里准新车","content":"这里是宝马X5图文描述"}
{"index":{"_index":"suggest_carinfo","_id":8}}
{"title":"奔驰大G 无敌车况","content":"这里是奥迪A6图文描述"}
{"index":{"_index":"suggest_carinfo","_id":9}}
{"title":"奔驰C260","content":"这里是奔驰图文描述"}
{"index":{"_index":"suggest_carinfo","_id":10}}
{"title":"nir奔驰C260","content":"这里是奔驰图文描述"}GET suggest_carinfo/_search?pretty
{"suggest": {"car_suggest": {"prefix": "奥迪","completion": {"field": "title.suggest"}}}
}

4 context suggester

完成建议者会考虑索引中的所有文档,但是通常来说,我们在进行智能推荐的时候最好通过某些条件过滤,并且有可能会针对某些特性提升权重。
在这里插入图片描述

# context suggester
# 定义一个名为 place_type 的类别上下文,其中类别必须与建议一起发送。
# 定义一个名为 location 的地理上下文,类别必须与建议一起发送
DELETE place
PUT place
{"mappings": {"properties": {"suggest": {"type": "completion","contexts": [{"name": "place_type","type": "category"},{"name": "location","type": "geo","precision": 4}]}}}
}PUT place/_doc/1
{"suggest": {"input": [ "timmy's", "starbucks", "dunkin donuts" ],"contexts": {"place_type": [ "cafe", "food" ]                    }}
}
PUT place/_doc/2
{"suggest": {"input": [ "monkey", "timmy's", "Lamborghini" ],"contexts": {"place_type": [ "money"]                    }}
}

版权声明:

本网仅为发布的内容提供存储空间,不对发表、转载的内容提供任何形式的保证。凡本网注明“来源:XXX网络”的作品,均转载自其它媒体,著作权归作者所有,商业转载请联系作者获得授权,非商业转载请注明出处。

我们尊重并感谢每一位作者,均已注明文章来源和作者。如因作品内容、版权或其它问题,请及时与我们联系,联系邮箱:809451989@qq.com,投稿邮箱:809451989@qq.com

热搜词