Filter not working as expected in Elasticsearch
Question
My Elasticsearch document looks like this:
{
  "_index": "terms",
  "_type": "_doc",
  "_id": "30028861",
  "_version": 36,
  "_seq_no": 338988,
  "_primary_term": 6,
  "found": true,
  "_source": {
    "title": {
      "value": "Credit Card Number",
      "is_changed": false
    },
    "node_id": 30028861,
    "domain_id": 30028838,
    "node_name": "Credit Card Number",
    "parent_id": 30028839,
    "term_name": "Credit Card Number",
    "account_id": 30028807,
    "published_status": "Published",
    "published_status_id": "PUB",
    "node_sub_type": "TRM"
  }
}
I am trying to add a filter clause with a must_not condition.
My query is:
{
  "from": 0,
  "size": 10,
  "track_total_hits": true,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "account_id": 30028807
          }
        }
      ],
      "filter": [
        {
          "bool": {
            "should": [
              {
                "terms": {
                  "domain_id": [30119066, 30123338]
                }
              },
              {
                "terms": {
                  "parent_id": [30028839]
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "terms": {
                  "published_status_id": ["PUB"]
                }
              },
              {
                "terms": {
                  "published_status": ["Published"]
                }
              }
            ]
          }
        }
      ],
      "must_not": {
        "terms": {
          "published_status.raw": ["Disabled"]
        }
      }
    }
  },
  "_source": ["node_type", "node_id", "parent_name", "table_name"]
}
As written, the must_not clause should exclude any document whose published_status_id is PUB or whose published_status is Published, but the query still returns documents that have both values. The whole intention of the filter clause is to return a document only when its domain_id or parent_id matches, its published_status_id is not PUB, and its published_status is not Published. Instead, it returns documents with status PUB.
Answer 1
Score: 1
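I see you have published_status.raw, which is probably a keyword sub-field, so published_status itself is probably a text field. A text field is analyzed at index time (with the default analyzer, Published is indexed as the lowercase token published), whereas a terms query looks for the exact value you pass in, so "Published" never matches. In light of this, the clause in the must_not should probably target the keyword sub-field instead:
{
  "terms": {
    "published_status.raw": ["Published"]
  }
}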
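To illustrate why the original clause never excluded anything, here is a rough Python sketch. It assumes published_status is a text field using the default analyzer, which the question does not confirm. standard_analyze and terms_matches are hypothetical helpers that only imitate the behaviour; they are not Elasticsearch APIs.

```python
import re


def standard_analyze(text):
    """Rough stand-in for the default analyzer on a `text` field:
    split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def terms_matches(indexed_terms, query_values):
    """Imitate a `terms` query: the query values are not analyzed, so a
    document matches only if one value equals an indexed term exactly."""
    return any(value in indexed_terms for value in query_values)


source_value = "Published"
text_terms = standard_analyze(source_value)  # what a `text` field would index
keyword_terms = [source_value]               # what a `keyword` sub-field would index

print(terms_matches(text_terms, ["Published"]))     # False: must_not excludes nothing
print(terms_matches(keyword_terms, ["Published"]))  # True: must_not excludes the doc
```

If the real mapping turns out to be different (for example, published_status is already a keyword), this explanation does not apply and the cause lies elsewhere.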
Also make sure to check published_status_id, which might suffer from the same issue: you might need to use published_status_id.raw in order to make an exact terms match.
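Putting both suggestions together, the request body could be built roughly as follows. This is only a sketch: build_query is a hypothetical helper, and published_status_id.raw is an assumed sub-field that may not exist in your mapping, so check the mapping first and pass whichever field is actually a keyword.

```python
import json


def build_query(account_id, domain_ids, parent_ids, status_field, status_id_field):
    """Build the search body with exact-match exclusions on keyword fields."""
    return {
        "from": 0,
        "size": 10,
        "track_total_hits": True,
        "query": {
            "bool": {
                "must": [{"match": {"account_id": account_id}}],
                "filter": [
                    # Keep documents that match either the domain or the parent.
                    {"bool": {"should": [
                        {"terms": {"domain_id": domain_ids}},
                        {"terms": {"parent_id": parent_ids}},
                    ]}},
                    # Drop documents that match any of these exact values.
                    {"bool": {"must_not": [
                        {"terms": {status_id_field: ["PUB"]}},
                        {"terms": {status_field: ["Published"]}},
                    ]}},
                ],
                "must_not": {"terms": {status_field: ["Disabled"]}},
            }
        },
        "_source": ["node_type", "node_id", "parent_name", "table_name"],
    }


body = build_query(
    account_id=30028807,
    domain_ids=[30119066, 30123338],
    parent_ids=[30028839],
    status_field="published_status.raw",
    status_id_field="published_status_id.raw",  # assumed; may not exist in the mapping
)
print(json.dumps(body, indent=2))
```

Note that a must_not with two clauses excludes a document if either clause matches, which is what the question asks for (status id not PUB and status not Published).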