Django: 为相关表构建动态的Q查询

huangapple go评论88阅读模式
英文:

Django: Building dynamic Q queries for related tables

问题

我尝试根据相关对象来过滤表中的对象,但在这方面遇到了问题。

我有一个表格**Run**:

class Run(models.Model):
    start_time = models.DateTimeField(db_index=True)
    end_time = models.DateTimeField()

每个**Run对象都有一个相关表RunValue**:

class RunValue(models.Model):
    run = models.ForeignKey(Run, on_delete=models.CASCADE)
    run_parameter = models.CharField(max_length=50)
    value = models.FloatField(default=0)

在**RunValue**中,我们存储了运行的详细特征,称为run_parameter,例如电压、温度、压力等。

为了简单起见,让我们假设我要过滤的字段是"Min. Temperature"和"Max. Temperature"。

例如:

Run 1:

  • Run Values:
    • run_parameter: "Min. Temperature", value: 430
    • run_parameter: "Max. Temperature", value: 436

Run 2:

  • Run Values:
    • run_parameter: "Min. Temperature", value: 627
    • run_parameter: "Max. Temperature", value: 671

Run 3:

  • Run Values:
    • run_parameter: "Min. Temperature", value: 642
    • run_parameter: "Max. Temperature", value: 694

Run 4:

  • Run Values:
    • run_parameter: "Min. Temperature", value: 412
    • run_parameter: "Max. Temperature", value: 534

RunValue.value是浮点数,但为了简单起见,我们将其保留为整数)。

我的页面上有两个HTML输入,用户在其中输入最小和最大温度。它们都可以留空,或者只填写一个,或者两个都填写,因此它是一个开放式的筛选,可以定义要筛选的范围,也可以不筛选。例如,如果用户输入:

Min. temperature = 400
Max. temperature = 500

那么这组筛选条件应该只返回上述**Run实例示例中的Run 1**,其中下限大于400且上限小于500。所有其他**Run**都不符合条件。

因此,我需要返回所有**Run**对象实例,其中RunValue与用户输入的筛选条件匹配。

以下是我的尝试:

# 从请求中获取温度范围并设置默认的最小和最大筛选值:
temp_ranges = [0, 999999]  # 默认值,以防用户没有设置任何内容

if min_temp_filter:
    temp_ranges = [min_temp_filter, 999999]

if max_temp_filter:
    temp_ranges = [0, max_temp_filter]

if min_temp_filter and max_temp_filter:
    temp_ranges = [min_temp_filter, max_temp_filter]

# 开始Q查询
temp_q_queries = [
    Q(runvalue__run_parameter__icontains='Min. Temperature'),
    Q(runvalue__run_parameter__icontains='Max. Temperature')
]

queryset = models.Q(reduce(operator.or_, temp_q_queries), runvalue__value__range=temp_ranges)
filtered_run_instances = Run.objects.filter(queryset)

运行这个代码会得到一些结果,但不是预期的结果。它返回了Run 1Run 4,但只应该返回Run 1

问题可能出在这里:

queryset = models.Q(reduce(operator.or_, temp_q_queries), runvalue__value__range=temp_ranges)

这个查询会检查temp_ranges范围内的值,但不会同时考虑最小和最大温度。筛选需要同时查看两个范围,最小值和最大值。

我认为你需要进行如下伪代码所示的筛选:

所有具有RunValue实例的Run,其中RunValue.run_parameter是"Min. Temperature"或"Max. Temperature",且RunValue.value介于400和500之间

然后,我认为应该将值范围作为普通的Django筛选包括在Q查询中,用逗号分隔:

temp_q_queries = [
    Q(runvalue__run_parameter__icontains='Min. Temperature', runvalue__value__range=temp_ranges),
    Q(runvalue__run_parameter__icontains='Max. Temperature', runvalue__value__range=temp_ranges)
]

queryset = models.Q(reduce(operator.or_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)

但结果仍然相同,所以问题可能不在于值范围,而在于逻辑分组(我认为如此)。

因此,我尝试进行两个reduce Q查询(看起来有点复杂),以便表示:

所有具有RunValue实例的Run,其中名称是"Min. Temperature"且值大于400,以及所有具有RunValue实例的Run,其中名称是"Max. Temperature"且值小于500

temp_q_queries = [
    models.Q(reduce(operator.and_, [Q(runvalue__run_parameter__icontains='Min. Temperature'), Q(runvalue__value__gte=temp_ranges[0])])),
    models.Q(reduce(operator.and_, [Q(runvalue__run_parameter__icontains='Max. Temperature'), Q(runvalue__value__lte=temp_ranges[1])]))
]

queryset = models.Q(reduce(operator.and_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)

(注意所有三个reduce都更改为AND门)。

这导致了0次命中。

使用相同的复合reduce方法对temp_q_queries进行筛选,但将外部逻辑门更改为OR,结果仍然是相同的错误结果,即Run 1Run 4

queryset = models.Q(reduce(operator.or_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)

也许我在这里过于复杂化了,有一些非常简单的东西我没有看到(我已经试图解决这个逻辑难题2天了,开始有点视野狭窄了。但我希望它是可解的,而且很简单。

任何帮助

英文:

[EDIT]

I have created an example Django Repl.it playground preloaded with this exact case:

https://repl.it/@mormoran/Django-Building-dynamic-Q-queries-for-related-tables

[/EDIT]

I'm trying to filter objects on a table, based on related objects, but having trouble doing so.

I have a table Run:

class Run(models.Model):
	start_time = models.DateTimeField(db_index=True)
	end_time = models.DateTimeField()

Each Run object has related table RunValue:

class RunValue(models.Model):
	run = models.ForeignKey(Run, on_delete=models.CASCADE)
	run_parameter = models.CharField(max_length=50)
	value = models.FloatField(default=0)

In RunValue we store detailed characteristics of a run, called a run_parameter. Things such as, voltage, temperature, pressure, etc.

For simplicity's sake, let's assume the fields I want to filter on are "Min. Temperature" and "Max. Temperature".

So for example:

Run 1:
    Run Values:
        run_parameter: "Min. Temperature", value: 430
        run_parameter: "Max. Temperature", value: 436

Run 2:
    Run Values:
        run_parameter: "Min. Temperature", value: 627
        run_parameter: "Max. Temperature", value: 671

Run 3:
    Run Values:
        run_parameter: "Min. Temperature", value: 642
        run_parameter: "Max. Temperature", value: 694

Run 4:
    Run Values:
        run_parameter: "Min. Temperature", value: 412
        run_parameter: "Max. Temperature", value: 534

(RunValue.value are floats, but let's keep it to ints for simplicity).

I have two HTML inputs in my page where the user enters min and max (for temperatures). They can both be left blank, or just one, or both, so it's an open ended filter, it can define a range to filter on, or not. For example, if the user were to input:

Min. temperature = 400
Max. temperature = 500

That set of filters should only return Run 1 from the above Run instance examples, where the lower threshold is above 400 and the upper threshold is below 500. All other Run do not qualify.

So then I need to return all Run object instances where RunValue matches the filters the user has input.

This is what I have tried:

# Grabbing temp ranges from request and setting default filter mins and maxs:
temp_ranges = [0, 999999] # Defaults in case the user does not set anything

if min_temp_filter:
	temp_ranges = [min_temp_filter, 999999]

if max_temp_filter:
	temp_ranges = [0, max_temp_filter]

if min_temp_filter and max_temp_filter:
	temp_ranges = [min_temp_filter, max_temp_filter]

# Starting Q queries
temp_q_queries = [
	Q(runvalue__run_parameter__icontains='Min. Temperature'),
	Q(runvalue__run_parameter__icontains='Max. Temperature')
]

queryset = models.Q(reduce(operator.or_, temp_q_queries), runvalue__value__range=temp_ranges)
filtered_run_instances = Run.objects.filter(queryset)

Running that yields some results, but not the desired results. It returns Run 1 and Run 4, when it should only return Run 1.

The temp_ranges being from 400 to 500, Run 1 qualifies, but the max temperature of Run 4 is over 500, it should NOT qualify. The filters need to exclude object instances by looking at both ranges at the same time, the minimum AND the maximum.

The printed query is as follows:

(AND: (OR: ('runvalue__run_parameter__icontains', 'Min. Temperaure'), ('runvalue__run_parameter__icontains', 'Max. Temperature')), ('runvalue__value__range', ['400', '500']))

What I think I need to be filtering in pseudo code:

All Runs that have RunValue instances where the RunValue.run_parameter is either "Min. Temperature" OR "Max. Temperature" AND the RunValue.value are between 400 and 500.

I then thought I should be including the value ranges in the Q queries as a regular Django filter, separated by commas:

temp_q_queries = [
	Q(runvalue__run_parameter__icontains='Min. Temperature', runvalue__value__range=temp_ranges),
	Q(runvalue__run_parameter__icontains='Max. Temperature', runvalue__value__range=temp_ranges)
]

queryset = models.Q(reduce(operator.or_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)

Same result, so the value range is not the issue there, it's the logic grouping (I think?).

So I tried to do two reduce Q queries (a bit gnarly looking), so as to say:

All Runs that have RunValue instances where the name is "Min. Temperature" AND the values are higher than 400, AND all Runs that have RunValue instances where the name is "Max. Temperature" AND the values are lower than 500

temp_q_queries = [
	models.Q(reduce(operator.and_, [Q(runvalue__run_parameter__icontains='Min. Temperature'), Q(runvalue__value__gte=temp_ranges[0])]),
	models.Q(reduce(operator.and_, [Q(runvalue__run_parameter__icontains='Max. Temperature'), Q(runvalue__value__lte=temp_ranges[1])]))
]

queryset = models.Q(reduce(operator.and_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)

(Note all 3 reduce where changed to AND gates)

This yielded 0 hits.

Using the same compound reduce method for temp_q_queries but changing the outter logic gate for queryset to OR yields the same wrong results, Run 1 and Run 4:

queryset = models.Q(reduce(operator.or_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)

Perhaps I am over complicating myself here and there's something very simple I'm not seeing (I've been trying to solve this logic puzzle for 2 days now, getting a bit of tunnel vision. But I'm rather hoping it's solvable, and simple.

Any help or questions would be greatly appreciated.

答案1

得分: 4

你的问题是你需要同时满足两个条件,而且它们在RunValue相关表的同一行上永远都不同时有效。你想要选择具有在该范围内的"Min. Temperature"行和类似的"Max. Temperature"有效行的根对象。你必须使用子查询。

最好使用Django 3.0 **Exists()**子查询条件。它可以轻松定制为旧版本的Django。

一个具体的示例

from django.db.models import Exists, OuterRef

queryset = Run.objects.filter(
    Exists(RunValue.objects.filter(
        run=OuterRef('pk'),
        run_parameter='Min. temperature',
        value__gte=400)),
    Exists(RunValue.objects.filter(
        run=OuterRef('pk'),
        run_parameter='Max. temperature',
        value__lte=500)),
)

同样,也有一个通用解决方案,因为你想要一个动态过滤器:

filter_data = {
    'Min. temperature': 400,
    'Max. temperature': 500,
}

param_operators = {
    'Min. Temperature': 'gte',
    'Max. Temperature': 'lte',
    # 更多支持的参数... 例如 'some boolean as 0 or 1': 'eq'.
}

conditions = []
for key, value in filter_data.items():
    if value is not None:
        conditions.append(Exists(RunValue.objects.filter(
            run=OuterRef('pk'),
            run_parameter=key,
            **{'value__{}'.format(param_operators[key]): value}
        )))
queryset = Run.objects.filter(*conditions)

你知道"Min. Temperature" <= "Max. Temperature",但数据库优化器不知道。我通过删除一个多余的范围条件来进行了优化。完全删除无用的条件"Max. Temperature" <= 999999 也更好。

这个答案可以轻松定制为Django >=1.11 <= 2.2 **Exists()**条件,只要你阅读了那份文档的大约十行。

在这种简单情况下,你不需要一个Q()对象,即使你想要通过简短的一行表达式重写它并添加有助记的临时变量。

编辑 具体示例可以以以下方式重写,以适用于Django < 3.0:

queryset = Run.objects.annotate(
    min_temperature_filter=Exists(RunValue.objects.filter(
        run=OuterRef(&#39;pk&#39;),
        run_parameter=&#39;Min. temperature&#39;,
        value__gte=400)),
    max_temperature_filter=Exists(RunValue.objects.filter(
        run=OuterRef(&#39;pk&#39;),
        run_parameter=&#39;Max. temperature&#39;,
        value__lte=500)),
).filter(
    min_temperature_filter=True,
    max_temperature_filter=True,
)
英文:

Your problem was that you need to meet both conditions and they are never valid both on the same row of RunValue related table. You want to select root objects that have a row "Min. Temperature" in that range and similarly also a "Max. Temperature" valid row. You must use subqueries.

The best is to use Django 3.0 Exists() subquery condition. It can be easily customized for an old Django.

A concrete example:

from django.db.models import Exists, OuterRef

queryset = Run.objects.filter(
    Exists(RunValue.objects.filter(
        run=OuterRef(&#39;pk&#39;),
        run_parameter=&#39;Min. temperature&#39;,
        value__gte=400)),
    Exists(RunValue.objects.filter(
        run=OuterRef(&#39;pk&#39;),
        run_parameter=&#39;Max. temperature&#39;,
        value__lte=500)),
)

The same by a general solution, because you want a dynamic filter:

filter_data = {
    &#39;Min. temperature&#39;: 400,
    &#39;Max. temperature&#39;: 500,
}

param_operators = {
    &#39;Min. Temperature&#39;: &#39;gte&#39;,
    &#39;Max. Temperature&#39;: &#39;lte&#39;,
    # much more supported parameters... e.g. &#39;some boolean as 0 or 1&#39;: &#39;eq&#39;.
}

conditions = []
for key, value in filter_data.items():
    if value is not None:
        conditions.append(Exists(RunValue.objects.filter(
            run=OuterRef(&#39;pk&#39;),
            run_parameter=key,
            **{&#39;value__{}&#39;.format(param_operators[key]): value}
        )))
queryset = Run.objects.filter(*conditions)

You know that "Min. Temperature" <= "Max. Temperature", but the database optimizer doesn't know it. I optimized it by removing a superfluous condition of the range. It is also better to completely remove a useless condition "Max. Temperature" <= 999999.

This answer can be easily customized for Django >=1.11 <= 2.2 Exists() condition after you read approximately tens lines of that documentation.

You don't need a Q() object in this simple case, even if you want to rewrite it by short one-line expressions and add mnemonic temporary variables.


EDIT The concrete example can be rewritten for Django < 3.0 this way

queryset = Run.objects.annotate(
    min_temperature_filter=Exists(RunValue.objects.filter(
        run=OuterRef(&#39;pk&#39;),
        run_parameter=&#39;Min. temperature&#39;,
        value__gte=400)),
    max_temperature_filter=Exists(RunValue.objects.filter(
        run=OuterRef(&#39;pk&#39;),
        run_parameter=&#39;Max. temperature&#39;,
        value__lte=500)),
).filter(
    min_temperature_filter=True,
    max_temperature_filter=True,
)

huangapple
  • 本文由 发表于 2020年1月3日 23:55:16
  • 转载请务必保留本文链接:https://go.coder-hub.com/59581548.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定