Merge multiple dataframe rows into 1 column

Question

When I collect time data, the day and the time are on different rows. I would like to combine them into a single datetime column that holds both the date and the time.

This is a sample of the data I get:

sampletest = pd.DataFrame(columns=['daytime','datetime'])

sampletest = sampletest.append({'daytime':'Sunday, January 1'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'01:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'13:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'17:30'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'19:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'Monday, January 2'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'08:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'09:00'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'10:30'}, ignore_index=True)
sampletest = sampletest.append({'daytime':'11:30'}, ignore_index=True)

This is a sample of what the end result should look like.

ConvertTime = pd.DataFrame(columns=['daytime','datetime'])
ConvertTime = ConvertTime.append({'daytime':'Sunday, January 1','datetime': np.NaN}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'01:00','datetime': '2023-01-01 01:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'13:00','datetime': '2023-01-01 13:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'17:30','datetime': '2023-01-01 17:30:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'19:00','datetime': '2023-01-01 19:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'Monday, January 2','datetime': np.NaN}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'08:00','datetime': '2023-01-02 08:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'09:00','datetime': '2023-01-02 09:00:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'10:30','datetime': '2023-01-02 10:30:00'}, ignore_index=True)
ConvertTime = ConvertTime.append({'daytime':'11:30','datetime': '2023-01-02 11:30:00'}, ignore_index=True)

How should I script this merge? The data does not include a year, so assume all dates are in 2023.

Answer 1

Score: 1

You could try the following:

m = sampletest['daytime'].str.match(r'\d{2}:\d{2}')  # True for time rows such as "13:00"
sampletest['datetime'] = (
    pd.to_datetime((sampletest['daytime'] + ' 2023').where(~m))
    .ffill()
    .add(pd.to_timedelta((sampletest['daytime'] + ':00').where(m)))
    .dt.strftime('%Y-%m-%d %H:%M:%S')  # In case you want strings
)

Or, a bit simpler:

m = sampletest['daytime'].str.match(r'\d{2}:\d{2}')  # True for time rows such as "13:00"
sampletest['datetime'] = (
    pd.to_datetime(
        (sampletest['daytime'] + ' 2023 ').where(~m).ffill()
        + sampletest['daytime'].where(m)
    )
    .dt.strftime('%Y-%m-%d %H:%M:%S')  # In case you want strings
)

Result for the sample:

             daytime             datetime
0  Sunday, January 1                  NaN
1              01:00  2023-01-01 01:00:00
2              13:00  2023-01-01 13:00:00
3              17:30  2023-01-01 17:30:00
4              19:00  2023-01-01 19:00:00
5  Monday, January 2                  NaN
6              08:00  2023-01-02 08:00:00
7              09:00  2023-01-02 09:00:00
8              10:30  2023-01-02 10:30:00
9              11:30  2023-01-02 11:30:00
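
To make the masking and forward-fill idea easier to follow, here is a minimal step-by-step sketch of the same approach. It is not the answer's exact code: the names `is_time`, `dates`, `offsets` and `YEAR` are introduced here for illustration, the sample is shortened, and the explicit `format` string assumes English weekday and month names in the input.

```python
import pandas as pd

# Shortened version of the question's sample data.
sampletest = pd.DataFrame({'daytime': [
    'Sunday, January 1', '01:00', '13:00',
    'Monday, January 2', '08:00', '10:30',
]})

YEAR = 2023  # assumed year, as stated in the question

# True for rows that hold a time such as "13:00".
is_time = sampletest['daytime'].str.fullmatch(r'\d{2}:\d{2}')

# Parse only the date rows with an explicit format; time rows become NaT.
dates = pd.to_datetime(
    (sampletest['daytime'] + f' {YEAR}').where(~is_time),
    format='%A, %B %d %Y',
)

# Carry each date down to the time rows below it.
dates = dates.ffill()

# Turn "HH:MM" into a timedelta; date rows become NaT.
offsets = pd.to_timedelta((sampletest['daytime'] + ':00').where(is_time))

# date + offset; a NaT offset leaves the date rows as NaT.
sampletest['datetime'] = dates + offsets
print(sampletest)
```

This variant keeps `datetime` as a real datetime column. Appending `.dt.strftime('%Y-%m-%d %H:%M:%S')` would give strings, as in the answer above.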

PS: Don't use .append. It was deprecated in pandas 1.4 and removed in pandas 2.0, so it no longer works with current pandas versions.
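
As a rough illustration of that advice, the sample frame could be built in a single call, with `pd.concat` used if further rows arrive later. The `extra` rows and the `values` and `combined` names below are made up for the example and are not part of the question's data.

```python
import pandas as pd

# Raw values as they arrive: a date row followed by its time rows.
values = [
    'Sunday, January 1', '01:00', '13:00', '17:30', '19:00',
    'Monday, January 2', '08:00', '09:00', '10:30', '11:30',
]

# Build the frame in one call instead of appending row by row.
sampletest = pd.DataFrame({'daytime': values})

# If more rows show up later, wrap them in a frame and concatenate once.
# These extra rows are hypothetical, for illustration only.
extra = pd.DataFrame({'daytime': ['Tuesday, January 3', '07:15']})
combined = pd.concat([sampletest, extra], ignore_index=True)
print(combined.tail(3))
```

Collecting rows in a plain Python list and creating the frame once is usually cheaper than growing a frame repeatedly, because each concatenation copies the existing data.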

Answer 2

Score: 0

Merge the date and time into a single column (this assumes df already has separate date and time columns):

df['datetime'] = df.apply(lambda row: pd.to_datetime(row['date'] + ' ' + row['time']), axis=1)
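
The one-liner above needs `date` and `time` columns, which the question's data does not have. A possible way to get there is sketched below, under the assumption that every time row belongs to the nearest date row above it. The reshaping step and the names `raw` and `is_time` are illustrative, and the final parse is a vectorized stand-in for the row-wise `apply`.

```python
import pandas as pd

# Shortened version of the question's single-column data.
raw = pd.DataFrame({'daytime': [
    'Sunday, January 1', '01:00', '13:00',
    'Monday, January 2', '08:00',
]})

# True for rows that hold a time such as "13:00".
is_time = raw['daytime'].str.fullmatch(r'\d{2}:\d{2}')

# Hypothetical reshaping step: put each date next to its time rows,
# then drop the rows that only held a date. The year 2023 is assumed.
df = pd.DataFrame({
    'date': (raw['daytime'] + ' 2023').where(~is_time).ffill(),
    'time': raw['daytime'].where(is_time),
}).dropna(subset=['time']).reset_index(drop=True)

# Concatenate the two string columns and parse them in one call.
df['datetime'] = pd.to_datetime(
    df['date'] + ' ' + df['time'], format='%A, %B %d %Y %H:%M'
)
print(df)
```

Unlike the first answer, this drops the date-only rows rather than leaving them with a missing datetime, so it fits only if those rows are not needed in the result.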

huangapple
  • Posted on June 12, 2023 at 07:39:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76452923.html