英文:
How to aggregate over multiple fields in Django?
问题
我有一个名为Trade
的Django模型,用于存储多个市场的交易信息。这些对象每5分钟时间戳一次,我需要按市场
和日期时间
进行聚合。
在性能方面,最佳解决方案是什么?
如您所见,我可以提取所需时间戳的列表,进行迭代和数据聚合,但我担心这不是最有效的解决方案。
这是我的代码:
from django.db.models import Sum, Avg
# 按时间排序对象
qs = Trade.objects.all().order_by("-datetime")
# 提取唯一时间戳
dts = qs.values_list("datetime", flat=True).distinct()
for dt in dts:
cum_a = qs.filter(datetime=dt).aggregate(num_a=Sum('amount'))['num_a']
cum_t = qs.filter(datetime=dt).aggregate(num_t=Sum('trades'))['num_t']
avg_p = qs.filter(datetime=dt).aggregate(avg_p=Avg('price'))['avg_p']
# 存储聚合数据
希望这对您有所帮助。
英文:
I have a Dango model Trade
that store trades information for several markets. The objects are timestamped at every 5 minutes and I need to aggregate them by market
and by datetime
.
What is the best solution to do that with performance in mind ?
As you can see below, I could extract a list of the desired timestamps, iterate and aggregate data but I'm afraid it's not the most efficient solution.
class Trade(TimestampedModel):
market = models.ForeignKey(Market, on_delete=models.CASCADE, null=True)
datetime = models.DateTimeField(null=True)
amount = models.FloatField(null=True)
price = models.FloatField(null=True)
trades = models.FloatField(null=True)
This is my code:
from django.db.models import Sum, Avg
# Time order object
qs = Trade.objects.all().order_by("-datetime")
# Extract unique timestamps
dts = qs.values_list("datetime", flat=True).distinct()
for dt in dts:
cum_a = qs.filter(datetime=dt).aggregate(num_a=Sum('amount'))['num_a']
cum_t = qs.filter(datetime=dt).aggregate(num_t=Sum('trades'))['num_t']
avg_p = qs.filter(datetime=dt).aggregate(avg_p=Avg('price'))['avg_p']
....
# Store aggregated data
答案1
得分: 1
从 django.db.models 中导入 Sum, Avg
qs = Trade.objects.values('datetime').order_by('datetime').annotate(
cum_a=Sum('amount'),
cum_t=Sum('trades'),
avg_p=Avg('price')
)
你可以这样迭代:
for row in qs:
print(row["datetime"], row["cum_a"], row["cum_t"], row["avg_p"])
英文:
Try this:)
from django.db.models import Sum, Avg
qs = Trade.objects.values('datetime').order_by('datetime').annotate(
cum_a=Sum('amount'),
cum_t=Sum('trades'),
avg_p=Avg('price')
)
You iterate like that:
for row in qs:
print(row["datetime"], row["cum_a"], row["cum_t"], row["avg_p"])
On SQL side you are grouping by datetime
and running aggregation for each group.
SELECT "datetime", SUM("amount") AS "cum_a", SUM("trades") AS "cum_t", AVG("price") AS "avg_p"
FROM "Trade"
GROUP BY "datetime"
ORDER BY "datetime"
if you want to group by market too:
qs = Trade.objects.values('datetime', 'market').order_by('datetime', 'market') \
.annotate(
cum_a=Sum('amount'),
cum_t=Sum('trades'),
avg_p=Avg('price')
)
for row in qs:
print(row["datetime"], row["market"], row["cum_a"], row["cum_t"], row["avg_p"])
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论