MongoDB $text search with only negated terms
Question
How can the $text query operator be used to find documents not containing a list of forbidden words? The documents don't need to contain anything specific; just none of those words.
This is a pretty common use case, e.g. for profanity filtering, but the MongoDB documentation states, without any explanation or workarounds, that
> When passed a search string that only contains negated words, text search will not match any documents.
Answer 1
Score: 3
Since MongoDB doesn't support this feature, I guess every solution will be a hack.
Here is mine:
I would add a dummy field with the same static value to every document in the collection, like "dummy": "x", and add this field to the text index. Then I would add the dummy value x to the query as a positive term to get around this limitation:
> When passed a search string that only contains negated words, text
> search will not match any documents
// Every document carries the same static dummy value.
db.articles.insertMany(
  [
    { _id: 1, subject: "coffee", dummy: "x" },
    { _id: 2, subject: "Coffee Shopping", dummy: "x" },
    { _id: 3, subject: "Baking a cake", dummy: "x" },
    { _id: 4, subject: "baking", dummy: "x" },
    { _id: 5, subject: "Cafe Con Cake", dummy: "x" },
    { _id: 6, subject: "ice cream", dummy: "x" },
    { _id: 7, subject: "coffee and cream", dummy: "x" }
  ]
)
We add the dummy field to the text index:
db.articles.createIndex( { subject: "text", dummy: "text" } )
We add x to the query, and hide the dummy field with a projection:
db.articles.find( { $text: { $search: "x -cream -cake" } }, { dummy: 0 } )
The result contains only the documents without the forbidden words cream and cake:
{
"_id" : 4,
"subject" : "baking"
},
{
"_id" : 2,
"subject" : "Coffee Shopping"
},
{
"_id" : 1,
"subject" : "coffee"
}
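If the forbidden words come from a list rather than a hand-written string, the search string can be assembled in code. This is only a sketch in plain JavaScript: buildNegatedSearch is a hypothetical helper name, and it assumes the dummy value is the "x" used above and that every forbidden entry is a single word (phrases are rejected rather than guessed at).

```javascript
// Hypothetical helper: builds a $text search string that excludes every
// word in `forbidden`. `dummyToken` is assumed to be the static value
// stored in the indexed dummy field ("x" in the example above).
function buildNegatedSearch(dummyToken, forbidden) {
  const negated = forbidden.map((word) => {
    const trimmed = word.trim();
    // Only single words are handled; empty strings, phrases, and words
    // that already start with a hyphen are rejected.
    if (trimmed === "" || /\s/.test(trimmed) || trimmed.startsWith("-")) {
      throw new Error(`unsupported forbidden word: "${word}"`);
    }
    return `-${trimmed}`;
  });
  // The dummy token comes first so the search has one positive term.
  return [dummyToken, ...negated].join(" ");
}

const search = buildNegatedSearch("x", ["cream", "cake"]);
console.log(search); // x -cream -cake

// In the shell, the string could then be passed to the same query:
// db.articles.find({ $text: { $search: search } }, { dummy: 0 })
```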
Answer 2
Score: 1
The $text operator requires at least one positive (non-negated) term to match. You can then add as many forbidden words as you like after that term, like so:
db.articles.find({
$text: {
$search: "coffee -cream -shop"
}
})
I guess it's a limitation of MongoDB's text search engine.
So the alternative would be to skip $text and filter with regular expressions:
db.articles.find(
  {
    subject: {
      $not: {
        $in: [/cream/i, /shop/i]
      }
    }
  }
)
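When the forbidden words are held in a list, the same filter can be generated instead of typed out. The following is a sketch only: escapeRegExp and buildForbiddenFilter are hypothetical helper names, and the filter keeps the $not / $in shape of the query above.

```javascript
// Escape regex metacharacters so each forbidden word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds { <field>: { $not: { $in: [/word/i, ...] } } }.
function buildForbiddenFilter(field, forbidden) {
  const patterns = forbidden.map((w) => new RegExp(escapeRegExp(w), "i"));
  return { [field]: { $not: { $in: patterns } } };
}

const filter = buildForbiddenFilter("subject", ["cream", "shop"]);
console.log(filter.subject.$not.$in.map(String)); // [ '/cream/i', '/shop/i' ]

// In the shell, the filter could then be passed straight to find:
// db.articles.find(filter)
```

Note that these patterns match substrings, so /shop/i also rejects "Coffee Shopping", and the regex filter does not use the text index or its stemming.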