How to remove part of a string from every row in a pandas DataFrame?

Question

I have a pandas dataframe with one of the columns as follows:
log_df['message']

0      timestamp1678081731027formatversion1webaclidar...
1      timestamp1678081732769formatversion1webaclidar...
2      timestamp1678081732812formatversion1webaclidar...
3      timestamp1678081732890formatversion1webaclidar...
4      timestamp1678081736029formatversion1webaclidar...

358    timestamp1678082486777formatversion1webaclidar...
359    timestamp1678082487818formatversion1webaclidar...
360    timestamp1678082488038formatversion1webaclidar...
361    timestamp1678082490070formatversion1webaclidar...
362    timestamp1678082490070formatversion1webaclidar...

I want to remove the timestamp1678081731027 part (the word timestamp plus the digits after it) from every row of this column at once.

I tried:
df['message'] = df['message'].str.replace('timestamp1678081731027', '', regex=True)

Answer 1

Score: 2

If the prefix always has a fixed length, as in timestamp1678081731027 (22 characters), you can simply slice the message with str[22:]:

df['message2'] = df['message'].str[22:]
print(df)

# Output
                                               message                     message2
0    timestamp1678081731027formatversion1webaclidar...  formatversion1webaclidar...
1    timestamp1678081732769formatversion1webaclidar...  formatversion1webaclidar...
2    timestamp1678081732812formatversion1webaclidar...  formatversion1webaclidar...
3    timestamp1678081732890formatversion1webaclidar...  formatversion1webaclidar...
4    timestamp1678081736029formatversion1webaclidar...  formatversion1webaclidar...
358  timestamp1678082486777formatversion1webaclidar...  formatversion1webaclidar...
359  timestamp1678082487818formatversion1webaclidar...  formatversion1webaclidar...
360  timestamp1678082488038formatversion1webaclidar...  formatversion1webaclidar...
361  timestamp1678082490070formatversion1webaclidar...  formatversion1webaclidar...
362  timestamp1678082490070formatversion1webaclidar...  formatversion1webaclidar...
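
A minimal, self-contained sketch of the slicing approach, assuming every value starts with the literal word timestamp followed by exactly 13 digits. The sample rows are shortened, made-up stand-ins for the real log messages, and PREFIX_LEN is a name introduced here for illustration only.

```python
import pandas as pd

# Shortened, made-up stand-ins for the real log messages (assumption).
df = pd.DataFrame({
    "message": [
        "timestamp1678081731027formatversion1",
        "timestamp1678081732769formatversion1",
    ]
})

# "timestamp" is 9 characters and the number has 13 digits: 22 in total.
PREFIX_LEN = len("timestamp") + 13

# Vectorized slice: drop the first PREFIX_LEN characters of every row.
df["message2"] = df["message"].str[PREFIX_LEN:]
print(df["message2"].tolist())
```

If the number of digits can vary from row to row, a fixed slice would cut in the wrong place, so a pattern-based replacement is likely the safer choice in that case.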

Answer 2

Score: 1

IIUC, use a regex to remove timestamp together with the digits that follow it:

df['message'] = df['message'].str.replace(r'timestamp\d+', '', regex=True)
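
A sketch of why the pattern works where the literal attempt from the question does not: a literal string only matches rows that contain that exact timestamp, while \d+ matches any run of digits. The leading ^ is an addition that is not part of the answer above; it restricts the match to the start of each string. The sample rows are shortened, made-up stand-ins for the real data.

```python
import pandas as pd

# Shortened, made-up stand-ins for the real log messages (assumption).
df = pd.DataFrame({
    "message": [
        "timestamp1678081731027formatversion1",
        "timestamp1678081732769formatversion1",
    ]
})

# Literal text from the question: only the first row contains this exact value,
# so the second row is left unchanged.
literal = df["message"].str.replace("timestamp1678081731027", "", regex=False)

# Pattern: "timestamp" followed by one or more digits, anchored at the start.
cleaned = df["message"].str.replace(r"^timestamp\d+", "", regex=True)

print(literal.tolist())
print(cleaned.tolist())
```

Note that \d+ is greedy: if the text right after the timestamp could itself begin with a digit, those digits would be removed too, and a fixed count such as \d{13} might be more appropriate.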

huangapple
  • Posted on March 7, 2023, 14:23:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75658608.html