如何使用连续/链接的prefetch_related() 减少 SQL 语句?

huangapple go评论77阅读模式
英文:

How can I reduce SQL statements using consecutive/chained prefetch_related()?

问题

要减少使用prefetch_related()时执行的SQL语句数量,您可以尝试以下方法:

  1. 使用select_related():在许多情况下,select_related()可以替代prefetch_related()来减少SQL查询次数。select_related()执行一次SQL查询来获取相关对象,而不是为每个关联对象执行单独的查询。这对于外键关系非常有用。

  2. 使用only()方法:only()方法允许您选择要从数据库检索的字段,而不是选择整个对象。这可以减少数据的传输量,从而提高性能。

  3. 使用values()values_list():这些方法允许您仅检索相关对象的特定字段,而不是整个对象。这可以减少数据传输和数据库查询的数量。

  4. 使用annotate()F()表达式:如果您需要在查询中进行一些聚合或计算,可以使用这些功能来减少不必要的查询。

  5. 使用缓存:在适当的情况下,可以考虑使用缓存来存储已经检索的数据,以避免不必要的数据库查询。

  6. 使用only()defer():这些方法可以帮助您选择要立即加载和延迟加载的字段,从而减少数据库查询的数量。

根据您的具体代码和需求,您可以结合使用上述方法,以减少SQL查询的数量并提高性能。记住,性能优化通常需要根据特定情况进行调整,因此您可能需要尝试不同的方法来找到最佳的性能优化策略。

英文:

Please let me ask how I can reduce the number of SQL using chained prefetch_related().

I wrote

item = Item.objects.prefetch_related('item_photo', 'item_stock', 'item_review',) # .prefetch_related('item_photo').prefetch_related('item_stock')...

item
(0.000) SELECT "item"."id", "item"."name", "item"."price", "item"."category_id", "item"."runs", "item"."wins", "item"."description", "item"."total_sales", "item"."created_at", "item"."updated_at", "item"."on_sale", "item"."note" FROM "item" LIMIT 21; args=(); alias=default
(0.000) SELECT "item_photo"."id", "item_photo"."item_id", "item_photo"."photo", "item_photo"."priority", "item_photo"."note" FROM "item_photo" WHERE "item_photo"."item_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9) ORDER BY "item_photo"."priority" ASC; args=(1, 2, 3, 4, 5, 6, 7, 8, 9); alias=default
(0.000) SELECT "item_stock"."id", "item_stock"."item_id", "item_stock"."stock", "item_stock"."note" FROM "item_stock" WHERE "item_stock"."item_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9); args=(1, 2, 3, 4, 5, 6, 7, 8, 9); alias=default
(0.000) SELECT "item_review"."id", "item_review"."item_id", "item_review"."user_id", "item_review"."stars", "item_review"."comment", "item_review"."note" FROM "item_review" WHERE "item_review"."item_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9); args=(1, 2, 3, 4, 5, 6, 7, 8, 9); alias=default
<QuerySet [<Item: Apple>, <Item: Banana>, <Item: Orange>, <Item: Lemon>, <Item: Tea>, <Item: Ginger>, <Item: Onion>, <Item: Banana>, <Item: Water>]>

but, it displayed many SQL statements even if I bound tables with prefetch_related().

In addition, if I wrote following this in views.py,

pk= 4
item = Item.objects.prefetch_related('item_photo', 'item_stock', 'item_review',).get(pk=pk)
context = {
	'form': form,
	'item': item,
	'photos': item.item_photo.values(),
	'stock': item.item_stock.values(),
	'review': item.item_review.values(),
}

more SQL were executed.

Even though I modified code using Prefetch(),

item = Item.objects.prefetch_related(
	Prefetch(
		'item_photo',
		queryset=ItemPhoto.objects.filter(),
		to_attr='ipr',
	),
	Prefetch(
		'item_stock',
		queryset=ItemStock.objects.filter(),
		to_attr='isr',
	),
	Prefetch(
		'item_review',
		queryset=ItemReview.objects.filter(),
		to_attr='irr',
	),
)

item
<QuerySet [<Item: Apple>, <Item: Banana>, <Item: Orange>, <Item: Lemon>, <Item: Tea>, <Item: Ginger>, <Item: Onion>, <Item: Banana>, <Item: Water>]>

item.get(pk=4).ipr[0].photo  # [i.ipr[0].photo for i in item.filter(pk=4)][0]
(0.000) SELECT "item"."id", "item"."name", "item"."price", "item"."category_id", "item"."runs", "item"."wins", "item"."description", "item"."total_sales", "item"."created_at", "item"."updated_at", "item"."on_sale", "item"."note" FROM "item" WHERE "item"."id" = 4 LIMIT 21; args=(4,); alias=default
(0.000) SELECT "item_photo"."id", "item_photo"."item_id", "item_photo"."photo", "item_photo"."priority", "item_photo"."note" FROM "item_photo" WHERE "item_photo"."item_id" IN (4) ORDER BY "item_photo"."priority" ASC; args=(4,); alias=default
(0.000) SELECT "item_stock"."id", "item_stock"."item_id", "item_stock"."stock", "item_stock"."note" FROM "item_stock" WHERE "item_stock"."item_id" IN (4); args=(4,); alias=default
(0.000) SELECT "item_review"."id", "item_review"."item_id", "item_review"."user_id", "item_review"."stars", "item_review"."comment", "item_review"."note" FROM "item_review" WHERE "item_review"."item_id" IN (4); args=(4,); alias=default
Out[74]: <ImageFieldFile: shopping/Lemon/4cfd8e29320d4b63b77f494a759dadf6.jpg>

Still, so many SQL were executed.

If so, it seems it is better to write a code like following this because the number of executed SQL is fewer than the above way.

context = {
'form': form,
'item': Item.objects.get(pk=pk),
'photos': ItemPhoto.objects.filter(item_id=pk),
'stock': ItemStock.objects.get(item_id=pk),
'review': ItemReview.objects.filter(item_id=pk),
}

In this case, what and how should I change my code to reduce the number of executed SQL if I bind database tables with prefetch_related()?

Any advice is appreciated.

Thanks.

◆ models.py(omitted)

class Item(models.Model): # Product
	name = models.CharField("商品名", max_length=255, blank=False, default="")
	price = models.DecimalField("価格",
		max_digits=12,
		decimal_places=2,
		default=0.00, 
		blank=False,
	) 
	on_sale = models.BooleanField("販売中", default=True)


class ItemStock(models.Model):
	item = models.ForeignKey(Item, on_delete=models.SET_NULL,
		null=True, related_name="item_stock", verbose_name="商品番号")
	stock = models.IntegerField("在庫数", blank=False, default=0)


class ItemPhoto(models.Model):
	item = models.ForeignKey(Item, on_delete=models.CASCADE,
		related_name="item_photo", verbose_name="商品番号",)
	photo = models.ImageField("商品画像", blank=True, 
		upload_to=get_itemimage_path)


class ItemReview(models.Model):
	STARS_CHOICES = (
		("1", "★☆☆☆☆"),
		("2", "★★☆☆☆"),
		("3", "★★★☆☆"),
		('4', '★★★★☆'),
		('5', '★★★★★'),
	)

	item = models.ForeignKey(Item, verbose_name="商品番号",
		on_delete=models.CASCADE, related_name="item_review",)
	stars = models.CharField("評価", max_length=1, # evaluation
		choices=STARS_CHOICES, blank=False,
	) 

Python: 3.9 / Django: 4.1

答案1

得分: 2

> 似乎最好写一个像以下这样的代码,因为执行的SQL数量少于上面的方式。

让我们来统计执行的查询:

context = {
    'form': form,
    'item': Item.objects.get(pk=pk),  # 查询1
    'photos': ItemPhoto.objects.filter(item_id=pk),  # 查询2
    'stock': ItemStock.objects.get(item_id=pk),  # 查询3
    'review': ItemReview.objects.filter(item_id=pk),  # 查询4
}

然后让我们看一下预取版本:

item = Item.objects.prefetch_related(
    'item_photo',  # 查询1
    'item_stock',  # 查询2
    'item_review',  # 查询3
).get(pk=pk)  # 查询4

两者都会执行4个查询,如果需要来自4个表的模型,确实没有其他选择而要执行4个查询。

然而,当使用 prefetch_related() 时,您应该只在预取的属性上使用 .all(),不要尝试使用 .filter(),因为Django不会在它上面“模拟”一个查询集方法。

英文:

> it seems it is better to write a code like following this because the number of executed SQL is fewer than the above way.

Let's count the queries executed:

context = {
    'form': form,
    'item': Item.objects.get(pk=pk),  # Query 1
    'photos': ItemPhoto.objects.filter(item_id=pk),  # Query 2
    'stock': ItemStock.objects.get(item_id=pk),  # Query 3
    'review': ItemReview.objects.filter(item_id=pk),  # Query 4
}

Then let's look at the prefetch version:

item = Item.objects.prefetch_related(
    'item_photo', # Query 1
    'item_stock',  # Query 2
    'item_review',  # Query 3
).get(pk=pk)  # Query 4

Both will execute 4 queries, and if you need models from 4 tables, there really isn't any other option than to do 4 queries.

However, when using prefetch_related(), you should only really use .all() on the prefetched attribute, not try to e.g. .filter() it, because Django won't "emulate" a queryset method on it.

huangapple
  • 本文由 发表于 2023年2月26日 21:34:06
  • 转载请务必保留本文链接:https://go.coder-hub.com/75572339.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定