May 22, 2023, 11:16:37

Use Python to sum the data every 24 hours and filter for the data with the maximum sum

Question

I have three rows of data, each a time series covering a whole year at hourly resolution, so the array has shape (3, 8760).
I want to sum the values over every 24 hours (columns) and, for each 24-hour block, keep the values from the row with the largest sum.

For example:
A = (1,2,3,4,5,6)
B = (0,0,4,5,6,7)
C = (0,0,2,6,14,0)
If I sum the values every 2 hours (columns) and, for each block, keep the row with the largest sum,
the expected output would be
(1,2,4,5,14,0).

Currently, I am only trying to input the data into Python and create it in the form of a dataframe.

Answer 1

Score: 0

Taking your simplified example with shape (3, 6):

import numpy as np
a = np.array([(1,2,3,4,5,6), (0,0,4,5,6,7), (0,0,2,6,14,0)])

array([[ 1,  2,  3,  4,  5,  6],
       [ 0,  0,  4,  5,  6,  7],
       [ 0,  0,  2,  6, 14,  0]])

First, reshape the array so that each block of columns gets its own axis (block length 2 in the simplified example, 24 in your full case), then sum along that new axis:

b = a.reshape((3,3,2)).sum(axis=2)

array([[ 3,  7, 11],
       [ 0,  9, 13],
       [ 0,  8, 14]])

This gives you all partial sums, one row per series and one column per block. Now you can get, for each column, the index of the row whose sum is largest:

idx = np.argmax(b, axis=0)

array([0, 1, 2], dtype=int64)
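A side note on ties, as a minimal sketch: when two rows have the same sum in a block, np.argmax returns the first of them, so the earlier row wins. The `sums` array below is made up for illustration and is not part of the original data.

```python
import numpy as np

# Two rows tie in the first block (both sum to 5); row 1 wins the second block.
sums = np.array([[5, 1],
                 [5, 7],
                 [2, 3]])

# argmax along axis 0 returns the first row index that holds the maximum.
idx = np.argmax(sums, axis=0)
print(idx.tolist())  # [0, 1]
```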

Now you can select the values from the initial array according to idx:

a.reshape((3,3,2))[idx, range(a.shape[1]//2), :].reshape((6,))

array([ 1,  2,  4,  5, 14,  0])

This gives the answer you wanted.
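To make the indexing step less opaque, here is a small sketch of what it does. Passing idx together with a range of block positions pairs them up element by element, so block j is taken from row idx[j]. The names `blocks`, `picked` and `looped` are introduced here only for illustration.

```python
import numpy as np

a = np.array([(1, 2, 3, 4, 5, 6), (0, 0, 4, 5, 6, 7), (0, 0, 2, 6, 14, 0)])
blocks = a.reshape((3, 3, 2))
idx = np.argmax(blocks.sum(axis=2), axis=0)

# Fancy indexing pairs idx[j] with j, i.e. it picks blocks[idx[j], j, :].
picked = blocks[idx, np.arange(3), :]

# The same selection written as an explicit loop over the blocks.
looped = np.array([blocks[idx[j], j, :] for j in range(3)])

print(picked.tolist())  # [[1, 2], [4, 5], [14, 0]]
print(np.array_equal(picked, looped))  # True
```

The loop version is easier to read, while the indexed version avoids a Python-level loop over the blocks.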

Final solution

You can wrap it all in one function:

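def filter_row_cols(a, period=2):
    # Split the columns into consecutive blocks of `period` values each.
    # Assumes a.shape[1] is an exact multiple of period.
    n_blocks = a.shape[1] // period
    blocks = a.reshape((a.shape[0], n_blocks, period))
    # For each block, find the row with the largest sum.
    idx = np.argmax(blocks.sum(axis=2), axis=0)
    # Take each block from its winning row and flatten back to 1-D.
    return blocks[idx, np.arange(n_blocks), :].reshape(-1)

filter_row_cols(a)

array([ 1,  2,  4,  5, 14,  0])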
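For the full-size case, a sketch along the following lines could be used to check the approach with period=24. The data here is random and only stands in for the real series; `select_max_blocks` is a hypothetical name that repeats the logic of the routine above so the snippet runs on its own.

```python
import numpy as np

def select_max_blocks(a, period):
    # Hypothetical helper name; same logic as the routine above.
    n_blocks = a.shape[1] // period
    blocks = a.reshape((a.shape[0], n_blocks, period))
    idx = np.argmax(blocks.sum(axis=2), axis=0)
    return blocks[idx, np.arange(n_blocks), :].reshape(-1)

# Synthetic stand-in for a year of hourly data in three series (assumption).
rng = np.random.default_rng(42)
hourly = rng.integers(0, 100, size=(3, 8760))

result = select_max_blocks(hourly, period=24)

# Naive cross-check: pick each day's winning row with a plain loop.
expected_parts = []
for d in range(365):
    day = hourly[:, d * 24:(d + 1) * 24]
    expected_parts.append(day[np.argmax(day.sum(axis=1))])
expected = np.concatenate(expected_parts)

print(result.shape)  # (8760,)
print(np.array_equal(result, expected))  # True
```

If the data lives in a pandas DataFrame with one row per series, df.to_numpy() would give the array to pass in.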
