Is it impossible to paginate filtered results with the Weaviate vector DB?

Question

Query Filter Works

Using Weaviate's query filtering works fine, e.g. from their tutorial:

response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer", "round"])
    .with_where({
        "path": ["round"],
        "operator": "Equal",
        "valueText": "Double Jeopardy!"
    })
    .with_limit(20)
    .do()
)

Pagination Explodes?

But grabbing only the first 20 results is not enough for a full-featured semantic search. We need pagination to fetch the next 20 results, then the next 20, and so on. Weaviate uses query cursors for this, where a record's UUID is provided:

response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer", "round"])
    .with_where({
        "path": ["round"],
        "operator": "Equal",
        "valueText": "Double Jeopardy!"
    })
    .with_after("6726aaa8-818b-49dc-8fea-9bc646ddfed6")   # <-- ID cursor pagination
    .with_limit(20)
    .do()
)

But this throws an error:

> where cannot be set with after and limit parameters

The error comes from this line of Weaviate code, and the with_after() docs say it "requires limit to be set but cannot be combined with any other filters or search."

So we can't combine filters, cursors, and limit parameters like this.

What is the correct way to do filtered query pagination?

Answer 1

Score: 2


You can paginate using the with_limit() and with_offset() methods together.

For example, let's say you have 10 records in the DB for a given SomeClass class, and you would like to retrieve the first 5 records with one query, and the next 5 with a following query.

First query:

client.query.get("SomeClass", "someProp").with_offset(0).with_limit(5).do()

Second query:

client.query.get("SomeClass", "someProp").with_offset(5).with_limit(5).do()
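
Applied to the filtered query from the question, the same idea might look like the sketch below. It assumes that with_where() can be combined with with_offset() and with_limit(), which this answer implies but does not show, and that the v3 Python client returns results under response["data"]["Get"][<class name>]. The helper names iter_pages and make_weaviate_fetcher are hypothetical, introduced only for illustration.

```python
from typing import Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[[int, int], List[dict]], page_size: int
) -> Iterator[List[dict]]:
    """Yield successive pages by advancing the offset by page_size each time."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return  # nothing left to read
        yield page
        if len(page) < page_size:
            return  # a short page means this was the last one
        offset += page_size


def make_weaviate_fetcher(client, where_filter: dict):
    """Hypothetical adapter around the query builder used in the question.

    Assumption: a where filter may be combined with offset and limit.
    """
    def fetch_page(offset: int, limit: int) -> List[dict]:
        response = (
            client.query
            .get("JeopardyQuestion", ["question", "answer", "round"])
            .with_where(where_filter)
            .with_offset(offset)
            .with_limit(limit)
            .do()
        )
        # Assumed response shape for the v3 Python client.
        return response["data"]["Get"]["JeopardyQuestion"]

    return fetch_page
```

Usage would be along the lines of `for page in iter_pages(make_weaviate_fetcher(client, where_filter), 20): ...`, with where_filter being the same dictionary passed to with_where() in the question. Keeping the paging loop separate from the Weaviate call makes the offset arithmetic easy to check without a running server. Offset-based paging tends to get slower at deep offsets, and the server may cap how far offset plus limit can reach, so it is worth checking the Weaviate documentation before relying on this for very large result sets.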

huangapple
  • Posted on August 11, 2023 at 02:05:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76878289.html