April 17, 2023, 08:43:27

How can I manipulate this pandas DataFrame with time series data to make it easier to use?

Question

A melt-and-pivot sketch of the transformation (assuming the DataFrame is named df):

import pandas as pd

# Turn the time columns into rows
melted_df = df.melt(id_vars=['Customer', 'Item', 'Date'], var_name='Time', value_name='Value')

# Combine the date and time columns into a single timestamp
melted_df['DateTime'] = pd.to_datetime(melted_df['Date'].astype(str) + ' ' + melted_df['Time'])

# Drop the columns that are no longer needed
melted_df = melted_df.drop(columns=['Date', 'Time'])

# Pivot so that each item gets its own column
final_df = melted_df.pivot(index=['Customer', 'DateTime'], columns='Item', values='Value').reset_index()
final_df.columns.name = None  # remove the name of the columns index

I have a pandas DataFrame with time series data, whose columns look like this:

Customer	Item	Date	00:30	01:00	...	23:30
XYZ	A	2020-01-01	1	2	...	3
XYZ	B	2020-01-02	2	2	...	5
ABC	A	2020-01-01	1	5	...	3
ABC	B	2020-01-02	2	2	...	1

So the times of day are in the columns instead of the rows. I want to reshape this DataFrame so that each time column is combined with the date column and becomes a separate row, with one column per item, like this:

Customer	Date	Item A	Item B
XYZ	2020-01-01 00:00	1	2
XYZ	2020-01-01 00:30	1	2
XYZ	2020-01-01 01:00	1	2
XYZ	2020-01-02 00:00	1	2
XYZ	2020-01-02 00:30	1	2
XYZ	2020-01-02 01:00	1	2
ABC	2020-01-01 00:00	2	3
ABC	2020-01-01 00:30	2	2
ABC	2020-01-01 01:00	4	2
ABC	2020-01-02 00:00	2	3
ABC	2020-01-02 00:30	2	2
ABC	2020-01-02 01:00	4	2

How can I do this? I tried a method using a cross join, but that is very inefficient because I have a lot of rows (~100,000).

Answer 1

Score: 1

You could try the following (where df is your DataFrame):

import pandas as pd

df["Date"] = pd.to_datetime(df["Date"])
df = (
    df.rename(columns={"Item": "Items"})
    # Turn every time column into its own row
    .melt(id_vars=["Customer", "Items", "Date"], var_name="Time", value_name="Item")
    # Append ":00" so the time parses as HH:MM:SS, then add it to the date
    .assign(Date=lambda df: df["Date"] + pd.to_timedelta(df["Time"] + ":00"))
    .drop(columns="Time")
    # Spread the items back out into one column per item
    .pivot(index=["Customer", "Date"], columns="Items")
    .reset_index()
)
# Flatten the two-level column labels, e.g. ("Item", "A") becomes "Item A"
df.columns = [a if not b else f"{a} {b}" for a, b in df.columns]
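
To see what the pipeline above produces, here is a minimal, self-contained sketch run on a small made-up sample. The helper name to_long_by_item and the sample values are assumptions for illustration only, and the sample uses the same date for both items so that the pivot leaves no gaps. It needs pandas 1.1 or later, because pivot is given a list for index.

```python
import pandas as pd

# Made-up sample in the same wide layout as the question (only three time columns).
wide = pd.DataFrame(
    {
        "Customer": ["XYZ", "XYZ", "ABC", "ABC"],
        "Item": ["A", "B", "A", "B"],
        "Date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-01"],
        "00:30": [1, 2, 1, 2],
        "01:00": [2, 2, 5, 2],
        "23:30": [3, 5, 3, 1],
    }
)


def to_long_by_item(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per (Customer, timestamp), one column per item."""
    out = frame.copy()  # leave the caller's frame untouched
    out["Date"] = pd.to_datetime(out["Date"])
    out = (
        out.rename(columns={"Item": "Items"})
        # Wide -> long: every time column becomes a row.
        .melt(id_vars=["Customer", "Items", "Date"], var_name="Time", value_name="Item")
        # "00:30" -> "00:30:00" so to_timedelta can parse it, then add it to the date.
        .assign(Date=lambda d: d["Date"] + pd.to_timedelta(d["Time"] + ":00"))
        .drop(columns="Time")
        # Long -> wide again, now with one column per item.
        .pivot(index=["Customer", "Date"], columns="Items")
        .reset_index()
    )
    # Flatten the two-level labels: ("Item", "A") -> "Item A", ("Customer", "") -> "Customer".
    out.columns = [a if not b else f"{a} {b}" for a, b in out.columns]
    return out


result = to_long_by_item(wide)
print(result)
```

Note that pivot raises a ValueError when the same (Customer, Date, Items) combination occurs more than once. If the real data can contain such duplicates, pivot_table with an explicit aggregation function may be the better fit.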
