How to rename rows that have the same value in a column using Pandas?
Question
I have the following DataFrame for loan applications:
`df = pd.read_csv("C:\\Users\\User\\Documents\\exampleLoan.csv")`
Offer ID Activity Application ID
0 Offer_4108 O_Create Offer App_866651
1 Offer_4108 O_Sent App_866651
2 Offer_4108 O_Cancelled App_866651
3 Offer_7743 O_Create Offer App_691650
4 Offer_7743 O_Created App_691650
5 Offer_7743 O_Cancelled App_691650
6 Offer_8595 O_Create Offer App_691650
7 Offer_8595 O_Sent App_691650
8 Offer_8595 O_Cancelled App_691650
9 Offer_8731 O_Create Offer App_691650
10 Offer_8731 O_Sent App_691650
11 Offer_8731 O_Cancelled App_691650
12 Offer_9517 O_Create Offer App_957884
13 Offer_9517 O_Returned App_957884
14 Offer_9517 O_Refused App_957884
15 Offer_9363 O_Create Offer App_957884
16 Offer_9363 O_Returned App_957884
17 Offer_9363 O_Accepted App_957884
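Every loan application has an Application ID. An application may have several offers (for example, App_957884 has two offers: Offer_9517 and Offer_9363), and each offer goes through a sequence of activities that starts with O_Create Offer.
I want to rename the O_Create Offer activity by appending a number that counts the offers belonging to the same Application ID but having different Offer IDs (O_Create Offer 1, O_Create Offer 2, and so on).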
Expected Outcome:
```text
      Offer ID          Activity  Application ID
0   Offer_4108  O_Create Offer 1      App_866651
1   Offer_4108            O_Sent      App_866651
2   Offer_4108       O_Cancelled      App_866651
3   Offer_7743  O_Create Offer 1      App_691650
4   Offer_7743         O_Created      App_691650
5   Offer_7743       O_Cancelled      App_691650
6   Offer_8595  O_Create Offer 2      App_691650
7   Offer_8595            O_Sent      App_691650
8   Offer_8595       O_Cancelled      App_691650
9   Offer_8731  O_Create Offer 3      App_691650
10  Offer_8731            O_Sent      App_691650
11  Offer_8731       O_Cancelled      App_691650
12  Offer_9517  O_Create Offer 1      App_957884
13  Offer_9517        O_Returned      App_957884
14  Offer_9517         O_Refused      App_957884
15  Offer_9363  O_Create Offer 2      App_957884
16  Offer_9363        O_Returned      App_957884
17  Offer_9363        O_Accepted      App_957884
```
I'm a beginner with Python, please help. Thank you!
# Answer 1
**Score**: 1
You can use boolean indexing to select only the `O_Create Offer` rows, then use `GroupBy.cumcount` to number them within each `Application ID`:
```python
# Mask of the rows whose activity is exactly 'O_Create Offer'
m = df['Activity'].eq('O_Create Offer')
# Number those rows 1, 2, 3, ... within each Application ID
num = ' ' + df[m].groupby('Application ID').cumcount().add(1).astype(str)
# Append the number to the selected rows only
df.loc[m, 'Activity'] += num
```
Output:
```python
>>> df
      Offer ID          Activity  Application ID
0   Offer_4108  O_Create Offer 1      App_866651
1   Offer_4108            O_Sent      App_866651
2   Offer_4108       O_Cancelled      App_866651
3   Offer_7743  O_Create Offer 1      App_691650
4   Offer_7743         O_Created      App_691650
5   Offer_7743       O_Cancelled      App_691650
6   Offer_8595  O_Create Offer 2      App_691650
7   Offer_8595            O_Sent      App_691650
8   Offer_8595       O_Cancelled      App_691650
9   Offer_8731  O_Create Offer 3      App_691650
10  Offer_8731            O_Sent      App_691650
11  Offer_8731       O_Cancelled      App_691650
12  Offer_9517  O_Create Offer 1      App_957884
13  Offer_9517        O_Returned      App_957884
14  Offer_9517         O_Refused      App_957884
15  Offer_9363  O_Create Offer 2      App_957884
16  Offer_9363        O_Returned      App_957884
17  Offer_9363        O_Accepted      App_957884
```
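For anyone who wants to try this without the CSV file, here is a minimal self-contained sketch of the same mask-plus-`cumcount` idea. The function name `number_create_offers` and the shortened sample frame are only illustrative and not part of the original post. It also assumes that each offer has exactly one `O_Create Offer` row; if an offer could contain that activity twice, the counter would count rows rather than offers.

```python
import pandas as pd

# Shortened sample built from values in the question (not the full table).
df = pd.DataFrame({
    "Offer ID": ["Offer_7743", "Offer_7743", "Offer_8595",
                 "Offer_8595", "Offer_9517", "Offer_9363"],
    "Activity": ["O_Create Offer", "O_Created", "O_Create Offer",
                 "O_Sent", "O_Create Offer", "O_Create Offer"],
    "Application ID": ["App_691650", "App_691650", "App_691650",
                       "App_691650", "App_957884", "App_957884"],
})


def number_create_offers(frame, activity="O_Create Offer"):
    """Hypothetical helper: return a copy with matching activities numbered per application."""
    out = frame.copy()
    # Boolean mask of the rows to rename.
    mask = out["Activity"].eq(activity)
    # cumcount() yields 0, 1, 2, ... within each Application ID; add 1 to start at 1.
    counter = out[mask].groupby("Application ID").cumcount().add(1).astype(str)
    # Index alignment ensures each number lands on the row it was computed for.
    out.loc[mask, "Activity"] = out.loc[mask, "Activity"] + " " + counter
    return out


result = number_create_offers(df)
print(result)
```

Working on a copy leaves the original `df` untouched, which makes it easier to compare before and after while experimenting.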