How to rename rows that have the same value in a column using Pandas?

Question


I have the following DataFrame for a loan application:

`df = pd.read_csv("C:\\Users\\User\\Documents\\exampleLoan.csv")`

           Offer ID          Activity       Application ID
     0   Offer_4108    O_Create Offer           App_866651
     1   Offer_4108            O_Sent           App_866651
     2   Offer_4108       O_Cancelled           App_866651
     3   Offer_7743    O_Create Offer           App_691650
     4   Offer_7743         O_Created           App_691650
     5   Offer_7743       O_Cancelled           App_691650
     6   Offer_8595    O_Create Offer           App_691650
     7   Offer_8595            O_Sent           App_691650
     8   Offer_8595       O_Cancelled           App_691650
     9   Offer_8731    O_Create Offer           App_691650
    10   Offer_8731            O_Sent           App_691650
    11   Offer_8731       O_Cancelled           App_691650
    12   Offer_9517    O_Create Offer           App_957884
    13   Offer_9517        O_Returned           App_957884
    14   Offer_9517         O_Refused           App_957884
    15   Offer_9363    O_Create Offer           App_957884
    16   Offer_9363        O_Returned           App_957884
    17   Offer_9363        O_Accepted           App_957884

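Every submitted loan application has an Application ID. Each application may have several offers (for example, App_957884 has two: Offer_9517 and Offer_9363), and each offer must go through a sequence of activities starting with O_Create Offer.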

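I want to rename the O_Create Offer activity by appending a number to it, counting upward for each different Offer ID that shares the same Application ID (O_Create Offer 1, O_Create Offer 2, and so on).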

Expected outcome:

```text
       Offer ID            Activity       Application ID
 0   Offer_4108    O_Create Offer 1           App_866651
 1   Offer_4108              O_Sent           App_866651
 2   Offer_4108         O_Cancelled           App_866651
 3   Offer_7743    O_Create Offer 1           App_691650
 4   Offer_7743           O_Created           App_691650
 5   Offer_7743         O_Cancelled           App_691650
 6   Offer_8595    O_Create Offer 2           App_691650
 7   Offer_8595              O_Sent           App_691650
 8   Offer_8595         O_Cancelled           App_691650
 9   Offer_8731    O_Create Offer 3           App_691650
10   Offer_8731              O_Sent           App_691650
11   Offer_8731         O_Cancelled           App_691650
12   Offer_9517    O_Create Offer 1           App_957884
13   Offer_9517          O_Returned           App_957884
14   Offer_9517           O_Refused           App_957884
15   Offer_9363    O_Create Offer 2           App_957884
16   Offer_9363          O_Returned           App_957884
17   Offer_9363          O_Accepted           App_957884
```

I'm a beginner with Python, please help. Thank you!





# Answer 1
**Score**: 1

You can use boolean indexing to select only the `O_Create Offer` rows, then use `groupby` with `cumcount` to number those rows within each `Application ID`:

```python
# Boolean mask selecting only the 'O_Create Offer' rows
m = df['Activity'].eq('O_Create Offer')
# cumcount() numbers rows from 0 within each 'Application ID', so add 1
num = ' ' + df[m].groupby('Application ID').cumcount().add(1).astype(str)
# Append the counter to the selected rows only
df.loc[m, 'Activity'] += num
```

Output:

```text
>>> df
      Offer ID          Activity Application ID
0   Offer_4108  O_Create Offer 1     App_866651
1   Offer_4108            O_Sent     App_866651
2   Offer_4108       O_Cancelled     App_866651
3   Offer_7743  O_Create Offer 1     App_691650
4   Offer_7743         O_Created     App_691650
5   Offer_7743       O_Cancelled     App_691650
6   Offer_8595  O_Create Offer 2     App_691650
7   Offer_8595            O_Sent     App_691650
8   Offer_8595       O_Cancelled     App_691650
9   Offer_8731  O_Create Offer 3     App_691650
10  Offer_8731            O_Sent     App_691650
11  Offer_8731       O_Cancelled     App_691650
12  Offer_9517  O_Create Offer 1     App_957884
13  Offer_9517        O_Returned     App_957884
14  Offer_9517         O_Refused     App_957884
15  Offer_9363  O_Create Offer 2     App_957884
16  Offer_9363        O_Returned     App_957884
17  Offer_9363        O_Accepted     App_957884
```
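
If you want to try this without the original CSV, here is a minimal, self-contained sketch of the same idea. The sample rows are a subset of the data in the question, and `number_create_offers` is a hypothetical helper name chosen for illustration (it is not part of pandas). Unlike the snippet above, it returns a modified copy instead of changing `df` in place.

```python
import pandas as pd

# Assumed sample: a subset of the rows shown in the question.
df = pd.DataFrame({
    "Offer ID": ["Offer_4108", "Offer_4108", "Offer_7743", "Offer_7743",
                 "Offer_8595", "Offer_8595", "Offer_9517", "Offer_9363"],
    "Activity": ["O_Create Offer", "O_Sent", "O_Create Offer", "O_Created",
                 "O_Create Offer", "O_Sent", "O_Create Offer", "O_Create Offer"],
    "Application ID": ["App_866651", "App_866651", "App_691650", "App_691650",
                       "App_691650", "App_691650", "App_957884", "App_957884"],
})


def number_create_offers(frame, label="O_Create Offer"):
    """Return a copy in which each `label` row is numbered per Application ID."""
    out = frame.copy()
    # Boolean mask: True only for the rows to rename.
    mask = out["Activity"].eq(label)
    # cumcount() is 0-based within each group, so add 1 to start at 1.
    counter = out[mask].groupby("Application ID").cumcount().add(1).astype(str)
    # Both sides share the same index, so the strings line up row by row.
    out.loc[mask, "Activity"] = out.loc[mask, "Activity"] + " " + counter
    return out


result = number_create_offers(df)
print(result)
```

The numbering follows the order in which the `O_Create Offer` rows appear, so this sketch assumes each offer has exactly one such row, as in the question's data.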

huangapple
  • Published on June 13, 2023, 12:36:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76461717.html