How to check if adjacent row values with common group ID in a column are equal in a pandas dataframe?
Question
I have a dataframe with 7 columns, in which each FID is a study object. Objects that share the same value in "FamilyID" belong to the same "group". The "YDDW" column holds the names of the study objects.
I would like to check whether two adjacent FIDs within a group have the same name (value in "YDDW") when ordered by date (values in "Order_Date"). If two adjacent names are the same, I want to assign "A" to the object with the earlier date (defined by "Order_Date") in a new column, "Classification"; if the adjacent names differ, I want to assign "B" in the new column.
Below is a snapshot of the dataframe. The two FIDs 1506 and 3388 in group 006 have the same name in "YDDW", so "A" is assigned to the row of 3388 in the "Classification" column; the two objects 40 and 2369 in group 027 have different names in "YDDW", so "B" is assigned to the row of 2369 in the "Classification" column.
How can I implement this? Thanks in advance!
Answer 1
Score: 0
IIUC, and building on @Quang Hoang's comment, you can try this:
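# Labels for "same name as the previous row in the group" (True) and "different" (False).
d = {True: "A", False: "B"}
# Compare each row's YDDW with the previous row's YDDW within the same FamilyID.
df["Classification"] = df["YDDW"].eq(df.groupby("FamilyID")["YDDW"].shift()).map(d)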
Output:
print(df)
   HighOverID  FamilyID   FID  Order_Date  Order_Year  YDDW  FamilyOrder Classification
0        1506         6  1506  2021-08-25        2021  val1            2              B
1        3388         6  3388  2019-01-14        2019  val1            1              A
2          40        27    40  2023-02-23        2023  val2            2              B
3        2369        27  2369  2020-11-10        2020  val3            1              B
4        1203        55  1203  2021-11-24        2021  val4            2              B
5        3238        55  3238  2019-07-09        2019  val4            1              A
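Note that the one-liner compares each row with the row directly above it, so it relies on the rows of each FamilyID already being ordered latest-first, as they are in this sample. If that ordering is not guaranteed, a minimal sketch along the following lines may help. It sorts explicitly by "Order_Date" before comparing. The dataframe is rebuilt from the output above using only the columns the logic needs, and the names `ordered` and `later_name` are introduced here for illustration only.

```python
import pandas as pd

# Sample rows reconstructed from the output above (only the columns used here).
df = pd.DataFrame({
    "FamilyID": [6, 6, 27, 27, 55, 55],
    "FID": [1506, 3388, 40, 2369, 1203, 3238],
    "Order_Date": pd.to_datetime([
        "2021-08-25", "2019-01-14", "2023-02-23",
        "2020-11-10", "2021-11-24", "2019-07-09",
    ]),
    "YDDW": ["val1", "val1", "val2", "val3", "val4", "val4"],
})

# Sort latest-first within each group so shift() yields the next-later record.
ordered = df.sort_values(["FamilyID", "Order_Date"], ascending=[True, False])

# Name of the next-later record in the same group (NaN for the latest record).
later_name = ordered.groupby("FamilyID")["YDDW"].shift()

# Assignment aligns on the index, so df keeps its original row order.
# A comparison against NaN is False, so the latest record in a group gets "B".
df["Classification"] = ordered["YDDW"].eq(later_name).map({True: "A", False: "B"})

print(df)
```

Because the result is assigned back by index, the sort only affects the comparison and not the layout of `df`.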