Merge multiple dataframe rows into 1 column
Question
When I collect time data, the day and the time end up on different rows. I would like to combine them into a single datetime column that holds both the date and the time.
This is a sample of the data I get:
sampletest = pd.DataFrame(columns=['daytime','datetime'])
sampletest = sampletest.append({'daytime':'Sunday, January 1'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'01:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'13:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'17:30'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'19:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'Monday, January 2'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'08:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'09:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'10:30'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'11:30'}, ignore_index=True)
This is a sample of what the end result should look like.
ConvertTime = pd.DataFrame(columns=['daytime','datetime'])
ConvertTime = ConvertTime.append({'daytime':'Sunday, January 1','datetime': np.NaN}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'01:00','datetime': '2023-01-01 01:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'13:00','datetime': '2023-01-01 13:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'17:30','datetime': '2023-01-01 17:30:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'19:00','datetime': '2023-01-01 19:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'Monday, January 2','datetime': np.NaN}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'08:00','datetime': '2023-01-02 08:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'09:00','datetime': '2023-01-02 09:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'10:30','datetime': '2023-01-02 10:30:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'11:30','datetime': '2023-01-02 11:30:00'}, ignore_index=True)
How should I script this merge? Also, the data does not include a year, so we will assume all days are in 2023.
Answer 1
Score: 1
You could try the following:
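m = sampletest['daytime'].str.match(r'\d{2}:\d{2}')  # True for time rows such as '01:00'
sampletest['datetime'] = (
    pd.to_datetime((sampletest['daytime'] + ' 2023').where(~m))  # parse the date rows, NaT elsewhere
    .ffill()  # carry each date down to the time rows below it
    .add(pd.to_timedelta((sampletest['daytime'] + ':00').where(m)))  # add the time of day; date rows become NaT
    .dt.strftime('%Y-%m-%d %H:%M:%S')  # In case you want strings
)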
Or, a bit simpler:
m = sampletest['daytime'].str.match(r'\d{2}:\d{2}')  # True for time rows such as '01:00'
sampletest['datetime'] = (
    pd.to_datetime(
        (sampletest['daytime'] + ' 2023 ').where(~m).ffill()  # date string carried down to each row
        + sampletest['daytime'].where(m)  # append the time; date rows become NaN
    )
    .dt.strftime('%Y-%m-%d %H:%M:%S')  # In case you want strings
)
Result for the sample:
             daytime             datetime
0  Sunday, January 1                  NaN
1              01:00  2023-01-01 01:00:00
2              13:00  2023-01-01 13:00:00
3              17:30  2023-01-01 17:30:00
4              19:00  2023-01-01 19:00:00
5  Monday, January 2                  NaN
6              08:00  2023-01-02 08:00:00
7              09:00  2023-01-02 09:00:00
8              10:30  2023-01-02 10:30:00
9              11:30  2023-01-02 11:30:00
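For readers who prefer to see the logic spelled out, here is a rough plain-Python sketch of the same idea: remember the most recent date row and attach it to every time row that follows. The function name combine and its year parameter are made up for illustration, and the parsing of day and month names assumes an English locale.

```python
from datetime import datetime


def combine(values, year=2023):
    """Pair each 'HH:MM' entry with the most recent date row above it."""
    result = []
    current_date = None  # date taken from the latest date row seen so far
    for value in values:
        if ',' in value:
            # Date row such as 'Sunday, January 1': remember it, emit no datetime
            current_date = datetime.strptime(f'{value} {year}', '%A, %B %d %Y').date()
            result.append(None)
        else:
            # Time row such as '17:30': attach it to the remembered date
            time_part = datetime.strptime(value, '%H:%M').time()
            result.append(
                datetime.combine(current_date, time_part) if current_date else None
            )
    return result


values = ['Sunday, January 1', '01:00', '17:30', 'Monday, January 2', '10:30']
for stamp in combine(values):
    print(stamp)
```

The pandas versions above do the same thing without an explicit loop, which is usually preferable on larger frames.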
PS: Don't use .append. It is deprecated and was removed in pandas 2.0, so it no longer works with the current pandas version.
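As a possible replacement, the sample frame can be built in a single call, and pd.concat can be used when more rows arrive later. This is only a sketch: the names rows, more and sampletest_extended, and the extra 'Tuesday, January 3' rows, are invented for illustration.

```python
import pandas as pd

# Raw values in the order they are collected (same as the question's sample)
rows = [
    'Sunday, January 1', '01:00', '13:00', '17:30', '19:00',
    'Monday, January 2', '08:00', '09:00', '10:30', '11:30',
]

# Build the frame in one call instead of appending row by row
sampletest = pd.DataFrame({'daytime': rows})

# If data arrives in chunks, concatenate whole frames rather than single rows
more = pd.DataFrame({'daytime': ['Tuesday, January 3', '07:15']})
sampletest_extended = pd.concat([sampletest, more], ignore_index=True)

print(sampletest_extended)
```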
Answer 2
Score: 0
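Merge the date and time into a single column. Note that this assumes df already has separate 'date' and 'time' columns, which the sample data in the question does not: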
df['datetime'] = df.apply(lambda row: pd.to_datetime(row['date'] + ' ' + row['time']), axis=1)