Is it impossible to paginate filtered results with the Weaviate vector DB?
Question
Query Filter Works
Using Weaviate's query filtering works fine, e.g. from their tutorial:
response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer", "round"])
    .with_where({
        "path": ["round"],
        "operator": "Equal",
        "valueText": "Double Jeopardy!"
    })
    .with_limit(20)
    .do()
)
Pagination Explodes?
But grabbing the first 20 results is not useful for a full-featured semantic search feature. We need pagination to grab the next 20 results, and the next 20, and so on. Weaviate uses query cursors to do this, where a record's UUID is provided:
response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer", "round"])
    .with_where({
        "path": ["round"],
        "operator": "Equal",
        "valueText": "Double Jeopardy!"
    })
    .with_after("6726aaa8-818b-49dc-8fea-9bc646ddfed6")  # <-- ID cursor pagination
    .with_limit(20)
    .do()
)
But this throws an error:
> where cannot be set with after and limit parameters
The error comes from this line of Weaviate code, and the with_after() docs say it "requires limit to be set but cannot be combined with any other filters or search."
So we can't combine filters, cursors, and limit parameters like this.
What is the correct way to do filtered query pagination?
Answer 1
Score: 2
You can paginate using the with_limit() and with_offset() methods together.
For example, let's say you have 10 records in the DB for a given SomeClass class, and you would like to retrieve the first 5 records with one query and the next 5 with a following query.
First query:
client.query.get("SomeClass", "someProp").with_offset(0).with_limit(5).do()
Second query:
client.query.get("SomeClass", "someProp").with_offset(5).with_limit(5).do()
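Applied to the filtered query from the question, the same idea might look like the sketch below. It assumes with_offset() can be chained with with_where() in your client version (worth verifying, since the answer only shows the unfiltered case). The helper names page_offset, fetch_filtered_page, and ROUND_FILTER are hypothetical and only serve the illustration; client is assumed to be an already connected Weaviate client like the one used in the question.

```python
from typing import Any, Dict

# Filter copied from the question: keep only "Double Jeopardy!" rows.
ROUND_FILTER: Dict[str, Any] = {
    "path": ["round"],
    "operator": "Equal",
    "valueText": "Double Jeopardy!",
}


def page_offset(page: int, page_size: int) -> int:
    """Return how many results to skip for a zero-based page index."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return page * page_size


def fetch_filtered_page(client: Any, page: int, page_size: int = 20) -> Any:
    """Fetch one page of filtered results with offset-based pagination.

    `client` is assumed to expose the same query builder as in the
    question (get / with_where / with_offset / with_limit / do).
    """
    return (
        client.query
        .get("JeopardyQuestion", ["question", "answer", "round"])
        .with_where(ROUND_FILTER)                   # same filter on every page
        .with_offset(page_offset(page, page_size))  # skip the earlier pages
        .with_limit(page_size)                      # size of this page
        .do()
    )
```

With page_size=20, fetch_filtered_page(client, 0) asks for the first 20 matches and fetch_filtered_page(client, 1) for the next 20. Note that offset-based paging gets more expensive the deeper you go, and Weaviate caps offset + limit with a server-side setting (QUERY_MAXIMUM_RESULTS, 10,000 by default as far as I know), so this suits user-facing result pages better than exporting an entire class.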