How to assign new columns based on chaining in pandas

Question

I'm trying to create a new dataframe using chaining in pandas.

result = (
    df.drop(['Day_of_year', 'Month', 'Week_of_year'], axis='columns'),
    pd.to_datetime(df['timestamp']),
    .assign("random" = 0)
)


# Access the first element of the tuple
updated_df = result[0]
updated_df

If I comment out the last line, the code in the tuple works, but I want to assign new columns.

How do I do this?

Answer 1

Score: 1

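The commas inside the parentheses create a tuple of separate results rather than a method chain, and .assign("random" = 0) is not valid syntax: a keyword argument name cannot be a quoted string. Chain the calls with dots and pass the new columns to assign instead. Try this: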

updated_df = (
    df.drop(columns=['Day_of_year', 'Month', 'Week_of_year'])
    .assign(**{"random": 0, 'timestamp': pd.to_datetime(df['timestamp'])})
)
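
To see the difference between commas and dots inside the parentheses, here is a minimal sketch. The two-row frame is a made-up stand-in for the asker's data, not the real dataset:

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame.
df = pd.DataFrame({"timestamp": ["2023-07-01", "2023-07-02"], "Month": [7, 7]})

# Commas inside parentheses build a tuple of independent results;
# nothing is chained here.
result = (
    df.drop(columns=["Month"]),
    pd.to_datetime(df["timestamp"]),
)
print(type(result).__name__)  # tuple

# Dots chain the methods, so each call works on the previous result.
chained = df.drop(columns=["Month"]).assign(random=0)
print(list(chained.columns))  # ['timestamp', 'random']
```

This is why result[0] was needed in the question: the parentheses held a tuple, and the DataFrame was only its first element.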

Testing

Evaluating the above code on a dummy dataframe:

import pandas as pd

# Create a date range
date_range = pd.date_range(start='2023-07-01', end='2023-07-10')

# Create the DataFrame
df = pd.DataFrame()
df['timestamp'] = date_range
df['Day_of_year'] = df['timestamp'].dt.dayofyear
df['Month'] = df['timestamp'].dt.month
df['Week_of_year'] = df['timestamp'].dt.isocalendar().week

updated_df = (
    df.drop(columns=['Day_of_year', 'Month', 'Week_of_year'])
    .assign(**{"random": 0, 'timestamp': pd.to_datetime(df['timestamp'])})
)
print(updated_df)
# Prints:
#
#    timestamp  random
# 0 2023-07-01       0
# 1 2023-07-02       0
# 2 2023-07-03       0
# 3 2023-07-04       0
# 4 2023-07-05       0
# 5 2023-07-06       0
# 6 2023-07-07       0
# 7 2023-07-08       0
# 8 2023-07-09       0
# 9 2023-07-10       0
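
As a possible variant, the same chain can be written with plain keyword arguments and a callable, so the datetime conversion runs on the frame flowing through the chain instead of referring back to df. This is a sketch on a small made-up frame where timestamp starts out as strings, which is an assumption about the asker's data:

```python
import pandas as pd

# Hypothetical input: timestamps stored as strings.
df = pd.DataFrame({
    "timestamp": ["2023-07-01", "2023-07-02", "2023-07-03"],
    "Month": [7, 7, 7],
})

updated_df = (
    df.drop(columns=["Month"])
    .assign(
        # A callable receives the intermediate DataFrame (after drop).
        timestamp=lambda d: pd.to_datetime(d["timestamp"]),
        # A scalar is broadcast to every row.
        random=0,
    )
)
print(updated_df.dtypes)
```

The **{...} form in the answer is only required when a column name is not a valid Python identifier (for example, one containing a space); for names like random and timestamp, keyword arguments work the same way.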

huangapple
  • Posted on 2023-07-07 01:04:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76631090.html