如何通过预取或缓存“无关”表来优化@property方法。

huangapple go评论52阅读模式
英文:

How can I optimize @property method by prefetching or caching an "unrelated" table

问题

I have three model objects, two of which directly relate to each other.

class Child(models.Model):
    date_of_birth = models.DateField()
    name = models.CharField(max_length=255)
    gender = models.CharField(max_length=10, choices=Gender.choices)

    @property
    def age(self):
        today = date.today()
        return age_on_date(date_calculating_on=today, date_of_birth=self.date_of_birth)
class ChildBmiCheck(models.Model):

    child = models.ForeignKey(Child, models.PROTECT, related_name="bmi_checks")
    height_in_meters = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    weight_in_kgs = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    child_was_present = models.BooleanField(default=True)

    @property
    def zscore(self):
        if self.child_was_present is True:
            age = self.child.age * 12  # in months
            gender = self.child.gender

            # TODO this is slow because for every child bmi check we hit the DB
            #  |
            #  V
            matching_data = ReferenceZscoreData.objects.filter(sex=gender).get(age=age)

            return self.calculate_z_score(matching_data)
        else:
            return None

And one that is used under the comment in the @property method above, which is not directly related through a foreign key relationship.

class ReferenceZscoreData(models.Model):
    sex = models.CharField(max_length=1, choices=Sex.choices, db_index=True)
    age = models.IntegerField(db_index=True)

    property1ForZscore = models.FloatField()
    property2ForZscore = models.FloatField()
    property3ForZscore = models.FloatField()

When I fetch Child BMI checks, I see that the DB is queried for the relevant ReferenceZscoreData every time this method is called. This makes the HTTP fetch in Django Rest Framework really slow. However, ReferenceZscoreData only changes once a year, so there is no reason to go to the database so often.

I tried adding this table to prefetch_related, but it isn't related through any ForeignKey relationship.

class ChildBmiCheckViewSet(viewsets.ModelViewSet):
    queryset = ChildBmiCheck.objects.prefetch_related(
        'child'
        # There is no reference_z_score_data because this is related through an @property only!
    ).all()
    serializer_class = ChildBmiCheckSerializer
    permission_classes = [permissions.AllowAny]

How can I prefetch or cache this value to optimize it?

英文:

I have three model objects, two of which directly relate to each other.

class Child(models.Model):
    date_of_birth = models.DateField()
    name = models.CharField(max_length=255)
    gender = models.CharField(max_length=10, choices=Gender.choices)

    @property
    def age(self):
        today = date.today()
        return age_on_date(date_calculating_on=today, date_of_birth=self.date_of_birth)
class ChildBmiCheck(models.Model):

    child = models.ForeignKey(Child, models.PROTECT, related_name="bmi_checks")
    height_in_meters = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    weight_in_kgs = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    child_was_present = models.BooleanField(default=True)

    @property
    def zscore(self):
        if self.child_was_present is True:
            age = self.child.age * 12  # in months
            gender = self.child.gender

            # TODO this is slow because for every child bmi check we hit the DB
            #  |
            #  V
            matching_data = ReferenceZscoreData.objects.filter(sex=gender).get(age=age)

            return self.calculate_z_score(matching_data)
        else:
            return None

And one that is used under the comment in the @property method above, which is not directly related through a foreign key relationship.

class ReferenceZscoreData(models.Model):
    sex = models.CharField(max_length=1, choices=Sex.choices, db_index=True)
    age = models.IntegerField(db_index=True)


    property1ForZscore = models.FloatField()
    property2ForZscore = models.FloatField()
    property3ForZscore = models.FloatField()

When I fetch Child BMI checks, I see that the DB is queried for the relavent ReferenceZscoreData every time this method is called. This makes the HTTP fetch in Django Rest Framework really slow. However, ReferenceZscoreData only changes once a year, so there is no reason to go to the database so often.

I tried adding this table to prefetch_related, but it isn't related through any ForeignKey relationship.

class ChildBmiCheckViewSet(viewsets.ModelViewSet):
    queryset = ChildBmiCheck.objects.prefetch_related(
        'child'
        # There is no reference_z_score_data because this is related through an @property only!
    ).all()
    serializer_class = ChildBmiCheckSerializer
    permission_classes = [permissions.AllowAny]

How can I prefetch or cache this value to optimize it?

答案1

得分: 1

以下是翻译的代码部分:

有多种方法可以在一次数据库查询中获取数据

但在继续之前我建议您向`ChildBmiCheck`添加一个`DateTimeField`或一个代表年龄的`IntegerField`,因为一个孩子可以有多个身高和体重记录每个记录链接到特定的时间戳

虽然以下方法仍然使用`Child.date_of_birth`属性

**方法1. 注释子查询**

```python
from django.db.models import Func, OuterRef, Subquery
from django.db.models.functions import Floor, Extract

subquery_ref = ReferenceZscoreData.objects.filter(age=OuterRef('age'), sex=OuterRef('child__gender'))

queryset = ChildBmiCheck.objects.annotate(
    # Extract(interval, 'epoch') 返回总秒数,然后除以3600,24,30返回小时,天,月
    age = Floor(Extract(Func('child__date_of_birth', function='AGE'), 'epoch') / 3600 / 24 / 30)
).annotate(
    prop1 = Subquery(subquery_ref.values('property1ForZscore')),
    prop2 = Subquery(subquery_ref.values('property2ForZscore')),
    prop3 = Subquery(subquery_ref.values('property3ForZscore'))
).select_related('child')

然后在您的ChildBmiCheck.zscore中:

@property
def zscore(self):
    if self.child_was_present:
        return self.calculate_z_score(self.age, self.child.gender, self.prop1, self.prop2, self.prop3)
    # 您可能不需要另一种方法来计算zscore
    # 只需在这里编写计算步骤。

请注意,在注释年龄时我使用了Func(field_name, function='AGE')。这只在您的数据库具有AGE函数时有效。PostgreSQL内建了它,而SQLite3没有。您需要自己编写一个自定义的age函数。

方法2. 材料化视图

材料化视图是一个将查询结果存储在临时表中的数据库功能,因此在下一次查询时,它直接提取数据,无需进行繁重的计算。

同样,SQLite3不支持它。抱歉。

创建材料化视图的语法因数据库而异。以PostgreSQL为例:

-- 1. 创建一个计算zscore的函数
CREATE FUNCTION zscore(age numeric, gender text, prop1 numeric, prop2 numeric, prop3 numeric) RETURNS numeric AS $$
    -- zscore 计算步骤
$$ LANGUAGE SQL;

-- 2. 创建材料化视图
CREATE MATERIALIZED VIEW myapp_childbmicheck_zscore_matview AS
SELECT b.*, zscore(floor(extract(epoch from date_trunc('month', age(c.date_of_birth))) / 3600 / 24 / 30), c.gender, r.property1ForZscore, r.property2ForZscore, r.property3ForZscore) as zscore
FROM myapp_childbmicheck b
LEFT JOIN myapp_child c ON c.id = b.child_id
LEFT JOIN myapp_referencezscore r ON r.sex = c.gender AND r.age = floor(extract(epoch from date_trunc('month', age(c.date_of_birth))) / 3600 / 24 / 30)
WITH DATA;

然后在您的myapp/models.py中,创建相应的模型:

class ChildBmiCheckZscore(models.Model):
    child = models.ForeignKey(Child, models.DO_NOTHING)
    height_in_meters = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    weight_in_kgs = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    child_was_present = models.BooleanField(default=True)
    zscore = models.DecimalField(max_digits=40, decimal_places=10, null=True)
    class Meta:
        managed = False
        db_table = 'myapp_childbmicheck_zscore_mat_view'

然后在您的视图中:

queryset = ChildBmiCheckZscore.objects.all()

请注意,您需要手动刷新材料化视图,例如:

REFRESH MATERIALIZED VIEW myapp_childbmicheck_zscore_matview;

您需要这样做不是每年一次,但至少每月一次,甚至每天一次。一个孩子的zscore可能一年变化一次,但您有多个孩子,生日在不同的月份。

方法3. 向ChildBmiCheck添加一个Zscore字段

这个方法非常简单直接。每次创建ChildBmiCheck实例时都计算zscore。


<details>
<summary>英文:</summary>

There are multiple ways to fetch the data in one db hit.

But before proceeding, I would suggest you add a `DateTimeField` or an `IntegerField` (representing age) to `ChildBmiCheck`, since one child can have multiple height and weight records, each links to a specific timestamp. 

Although following methods still uses `Child.date_of_birth` attribute.

**Method 1. Annotating Subquery**

    from django.db.models import Func, OuterRef, Subquery
    from django.db.models.functions import Floor, Extract

    subquery_ref = ReferenceZscoreData.objects.filter(age=OuterRef(&#39;age&#39;), sex=OuterRef(&#39;child__gender&#39;))

    queryset = ChildBmiCheck.objects.annotate(
        # Extract(interval, &#39;epoch&#39;) returns total number of seconds
        # then divide by 3600, 24, 30 returns hours, days, months respectively
        age = Floor(Extract(Func(&#39;child__date_of_birth&#39;, function=&#39;AGE&#39;), &#39;epoch&#39;) / 3600 / 24 / 30)
    ).annotate(
        prop1 = Subquery(subquery_ref.values(&#39;property1ForZscore&#39;)),
        prop2 = Subquery(subquery_ref.values(&#39;property2ForZscore&#39;)),
        prop3 = Subquery(subquery_ref.values(&#39;property3ForZscore&#39;))
    ).select_related(&#39;child&#39;)

And then in your `ChildBmiCheck.zscore`

    @property
    def zscore(self):
        if self.child_was_present:
            return self.calculate_z_score(self.age, self.child.gender, self.prop1, self.prop2, self.prop3)
        # You probably don&#39;t need another method to calculate zscore
        # Just write the calculation steps here.

Please note I use `Func(field_name, function=&#39;AGE&#39;)` when annotating age. This only works when your database has AGE function. PostgreSQL has it built in while SQLite3 does not. You will need to write a custom age function yourself then.

**Method 2. Materialized View**

Materialized View is a database functionality that stores a query result into a temporary table, so that on next query, it pull the data directly, without the heavy calculation.

Again, SQLite3 does not support it. Sorry.

Syntax of creating materialized view varies by database. Take PostgreSQL as example.

     -- 1. Create a function to calculate zscore
     CREATE FUNCTION zscore(age numeric, gender text, prop1 numeric, prop2  numeric, prop3 numeric) RETURNS numeric AS $$
         -- zscore calculation steps
     $$ LANGUAGE SQL;

     -- 2. Create materialized view
     CREATE MATERIALIZED VIEW myapp_childbmicheck_zscore_matview AS
     SELECT b.*, zscore(floor(extract(epoch from date_trunc(&#39;month&#39;, age(c.date_of_birth))) / 3600 / 24 / 30, c.gender, r.property1ForZscore, r.property2ForZscore, r.property3ForZscore) as zscore
     FROM myapp_childbmicheck b
     LEFT JOIN myapp_child c ON c.id = b.child_id
     LEFT JOIN myapp_referencezscore r ON r.sex = c.gender AND r.age = floor(extract(epoch from date_trunc(&#39;month&#39;, age(c.date_of_birth))) / 3600 / 24 / 30)
     WITH DATA;

Then in you myapp/models.py, create corresponding Model

    class ChildBmiCheckZscore(models.Model):
        child = models.ForeignKey(Child, models.DO_NOTHING)
        height_in_meters = models.DecimalField(max_digits=40, decimal_places=10, null=True)
        weight_in_kgs = models.DecimalField(max_digits=40, decimal_places=10, null=True)
        child_was_present = models.BooleanField(default=True)
        zscore = models.DecimalField(max_digits=40, decimal_places=10, null=True)
        class Meta:
            managed = False
            db_table = &#39;myapp_childbmicheck_zscore_mat_view&#39;

Then in your views

     queryset = ChildBmiCheckZscore.objects.all()

Please note you will need to refresh the materialized view manually, like

    REFRESH MATERIALIZED VIEW myapp_childbmicheck_zscore_matview;

You need to do this **NOT** annually, but at least monthly, even daily. One child&#39;s zscore may change once a year, but you have multiple children with birthdays on different months. 

**Method 3. Add A Zscore Field To ChildBmiCheck**

This one is quite straight forward. Calculate zscore each time a `ChildBmiCheck` instance is created.

</details>



huangapple
  • 本文由 发表于 2023年6月6日 06:00:42
  • 转载请务必保留本文链接:https://go.coder-hub.com/76410259.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定