Python - Link 2 columns
Question
<details>
<summary>English:</summary>
I want to create a data frame to link 2 columns together (customer ID to each order ID the customer placed). The row index + 1 correlates to the customer ID. Is there a way to do this through mapping?
**Data:** invoice_df
Order Id,Date,Meal Id,Company Id,Date of Meal,Participants,Meal Price,Type of Meal
839FKFW2LLX4LMBB,27-05-2016,INBUX904GIHI8YBD,LJKS5NK6788CYMUU,2016-05-31 07:00:00+02:00,['David Bishop'],469,Breakfast
97OX39BGVMHODLJM,27-09-2018,J0MMOOPP709DIDIE,LJKS5NK6788CYMUU,2018-10-01 20:00:00+02:00,['David Bishop'],22,Dinner
041ORQM5OIHTIU6L,24-08-2014,E4UJLQNCI16UX5CS,LJKS5NK6788CYMUU,2014-08-23 14:00:00+02:00,['Karen Stansell'],314,Lunch
YT796QI18WNGZ7ZJ,12-04-2014,C9SDFHF7553BE247,LJKS5NK6788CYMUU,2014-04-07 21:00:00+02:00,['Addie Patino'],438,Dinner
6YLROQT27B6HRF4E,28-07-2015,48EQXS6IHYNZDDZ5,LJKS5NK6788CYMUU,2015-07-27 14:00:00+02:00,['Addie Patino' 'Susan Guerrero'],690,Lunch
AT0R4DFYYAFOC88Q,21-07-2014,W48JPR1UYWJ18NC6,LJKS5NK6788CYMUU,2014-07-17 20:00:00+02:00,['David Bishop' 'Susan Guerrero' 'Karen Stansell'],181,Dinner
2DDN2LHS7G85GKPQ,29-04-2014,1MKLAKBOE3SP7YUL,LJKS5NK6788CYMUU,2014-04-30 21:00:00+02:00,['Susan Guerrero' 'David Bishop'],14,Dinner
FM608JK1N01BPUQN,08-05-2014,E8WJZ1FOSKZD2MJN,36MFTZOYMTAJP1RK,2014-05-07 09:00:00+02:00,['Amanda Knowles' 'Cheryl Feaster' 'Ginger Hoagland' 'Michael White'],320,Breakfast
CK331XXNIBQT81QL,23-05-2015,CTZSFFKQTY7SBZ4J,36MFTZOYMTAJP1RK,2015-05-18 13:00:00+02:00,['Cheryl Feaster' 'Amanda Knowles' 'Ginger Hoagland'],697,Lunch
FESGKOQN2OZZWXY3,10-01-2016,US0NQYNNHS1SQJ4S,36MFTZOYMTAJP1RK,2016-01-14 22:00:00+01:00,['Glenn Gould' 'Amanda Knowles' 'Ginger Hoagland' 'Michael White'],451,Dinner
YITOTLOF0MWZ0VYX,03-10-2016,RGYX8772307H78ON,36MFTZOYMTAJP1RK,2016-10-01 22:00:00+02:00,['Ginger Hoagland' 'Amanda Knowles' 'Michael White'],263,Dinner
8RIGCF74GUEQHQEE,23-07-2018,5XK0KTFTD6OAP9ZP,36MFTZOYMTAJP1RK,2018-07-27 08:00:00+02:00,['Amanda Knowles'],210,Breakfast
TH60C9D8TPYS7DGG,15-12-2016,KDSMP2VJ22HNEPYF,36MFTZOYMTAJP1RK,2016-12-13 08:00:00+01:00,['Cheryl Feaster' 'Bret Adams' 'Ginger Hoagland'],755,Breakfast
W1Y086SRAVUZU1AL,17-09-2017,8IUOYVS031QPROUG,36MFTZOYMTAJP1RK,2017-09-14 13:00:00+02:00,['Bret Adams'],469,Lunch
WKB58Q8BHLOFQAB5,31-08-2016,E2K2TQUMENXSI9RP,36MFTZOYMTAJP1RK,2016-09-03 14:00:00+02:00,['Michael White' 'Ginger Hoagland' 'Bret Adams'],502,Lunch
N8DOG58MW238BHA9,25-12-2018,KFR2TAYXZSVCHAA2,36MFTZOYMTAJP1RK,2018-12-20 12:00:00+01:00,['Ginger Hoagland' 'Cheryl Feaster' 'Glenn Gould' 'Bret Adams'],829,Lunch
DPDV9UGF0SUCYTGW,25-05-2017,6YV61SH7W9ECUZP0,36MFTZOYMTAJP1RK,2017-05-24 22:00:00+02:00,['Michael White'],708,Dinner
KNF3E3QTOQ22J269,20-06-2018,737T2U7604ABDFDF,36MFTZOYMTAJP1RK,2018-06-15 07:00:00+02:00,['Glenn Gould' 'Cheryl Feaster' 'Ginger Hoagland' 'Amanda Knowles'],475,Breakfast
LEED1HY47M8BR5VL,22-10-2017,I22P10IQQD06MO45,36MFTZOYMTAJP1RK,2017-10-22 14:00:00+02:00,['Glenn Gould'],27,Lunch
LSJPNJQLDTIRNWAL,27-01-2017,247IIVNN6CXGWINB,36MFTZOYMTAJP1RK,2017-01-23 13:00:00+01:00,['Amanda Knowles' 'Bret Adams'],672,Lunch
6UX5RMHJ1GK1F9YQ,24-08-2014,LL4AOPXDM8V5KP5S,H3JRC7XX7WJAD4ZO,2014-08-27 12:00:00+02:00,['Anthony Emerson' 'Irvin Gentry' 'Melba Inlow'],552,Lunch
5SYB15QEFWD1E4Q4,09-07-2017,KZI0VRU30GLSDYHA,H3JRC7XX7WJAD4ZO,2017-07-13 08:00:00+02:00,"['Anthony Emerson' 'Emma Steitz' 'Melba Inlow' 'Irvin Gentry'
'Kelly Killebrew']",191,Breakfast
W5S8VZ61WJONS4EE,25-03-2017,XPSPBQF1YLIG26N1,H3JRC7XX7WJAD4ZO,2017-03-25 07:00:00+01:00,['Irvin Gentry' 'Kelly Killebrew'],471,Breakfast
795SVIJKO8KS3ZEL,05-01-2015,HHTLB8M9U0TGC7Z4,H3JRC7XX7WJAD4ZO,2015-01-06 22:00:00+01:00,['Emma Steitz'],588,Dinner
8070KEFYSSPWPCD0,05-08-2014,VZ2OL0LREO8V9RKF,H3JRC7XX7WJAD4ZO,2014-08-09 12:00:00+02:00,['Lewis Eyre'],98,Lunch
RUQOHROBGBOSNUO4,10-06-2016,R3LFUK1WFDODC1YF,H3JRC7XX7WJAD4ZO,2016-06-09 08:00:00+02:00,['Anthony Emerson' 'Kelly Killebrew' 'Lewis Eyre'],516,Breakfast
6P91QRADC2O9WOVT,25-09-2016,L2F2HEGB6Q141080,H3JRC7XX7WJAD4ZO,2016-09-26 07:00:00+02:00,"['Kelly Killebrew' 'Lewis Eyre' 'Irvin Gentry' 'Emma Steitz'
'Anthony Emerson']",664,Breakfast
***
**Code:**

    import re
    import pandas as pd

    # Convert a string like "['name' 'name2']" to a list ['name', 'name2'].
    # Returns a list of participant names.
    def string_to_list(participant_string):
        return re.findall(r"'(.*?)'", participant_string)

    invoice_df["Participants"] = invoice_df["Participants"].apply(string_to_list)

    # Obtain an array of all unique customer names
    customers = invoice_df["Participants"].explode().unique()

    # Create new customer dataframe
    customers_df = pd.DataFrame(customers, columns=["CustomerName"])

    # Add customer id
    customers_df["customer_id"] = customers_df.index + 1

    # Create a first_name and last_name column
    customers_df["first_name"] = customers_df["CustomerName"].apply(lambda x: x.split(" ")[0])

    # Slice the list from index 1 onward, in case the person has multiple last names
    customers_df["last_name"] = customers_df["CustomerName"].apply(lambda x: " ".join(x.split(" ")[1:]))

</details>
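
The first/last name step in the question's code can also be written without `apply`. The following is only a sketch under assumptions: the third name is made up to show the multi-part case, and `parts` is an illustrative variable name that does not appear in the original post.

```python
import pandas as pd

# Two names from the sample data plus one made-up multi-part name (hypothetical).
customers_df = pd.DataFrame(
    {"CustomerName": ["David Bishop", "Karen Stansell", "Ana Maria Lopez"]}
)
customers_df["customer_id"] = customers_df.index + 1

# Split on the first space only: n=1 yields at most two parts,
# and expand=True returns them as two columns (0 and 1).
parts = customers_df["CustomerName"].str.split(" ", n=1, expand=True)
customers_df["first_name"] = parts[0]
customers_df["last_name"] = parts[1]
```

Everything after the first space ends up in `last_name`, so a name with several surname parts is kept together.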
# Answer 1
**Score**: 1
### Solution
```python
# Find all occurrences of customer names,
# then explode to convert the values in each list to rows
cust = invoice_df['Participants'].str.findall(r"'(.*?)'").explode()

# Join with the order ID
customers_df = invoice_df[['Order Id']].join(cust)

# Use factorize to encode the unique values in Participants
customers_df['Customer Id'] = customers_df['Participants'].factorize()[0] + 1
```

### Result

```
            Order Id     Participants  Customer Id
0   839FKFW2LLX4LMBB     David Bishop            1
1   97OX39BGVMHODLJM     David Bishop            1
2   041ORQM5OIHTIU6L   Karen Stansell            2
3   YT796QI18WNGZ7ZJ     Addie Patino            3
4   6YLROQT27B6HRF4E     Addie Patino            3
4   6YLROQT27B6HRF4E   Susan Guerrero            4
5   AT0R4DFYYAFOC88Q     David Bishop            1
5   AT0R4DFYYAFOC88Q   Susan Guerrero            4
5   AT0R4DFYYAFOC88Q   Karen Stansell            2
6   2DDN2LHS7G85GKPQ   Susan Guerrero            4
6   2DDN2LHS7G85GKPQ     David Bishop            1
7   FM608JK1N01BPUQN   Amanda Knowles            5
7   FM608JK1N01BPUQN   Cheryl Feaster            6
7   FM608JK1N01BPUQN  Ginger Hoagland            7
7   FM608JK1N01BPUQN    Michael White            8
8   CK331XXNIBQT81QL   Cheryl Feaster            6
8   CK331XXNIBQT81QL   Amanda Knowles            5
8   CK331XXNIBQT81QL  Ginger Hoagland            7
9   FESGKOQN2OZZWXY3      Glenn Gould            9
9   FESGKOQN2OZZWXY3   Amanda Knowles            5
9   FESGKOQN2OZZWXY3  Ginger Hoagland            7
9   FESGKOQN2OZZWXY3    Michael White            8
10  YITOTLOF0MWZ0VYX  Ginger Hoagland            7
10  YITOTLOF0MWZ0VYX   Amanda Knowles            5
10  YITOTLOF0MWZ0VYX    Michael White            8
11  8RIGCF74GUEQHQEE   Amanda Knowles            5
12  TH60C9D8TPYS7DGG   Cheryl Feaster            6
12  TH60C9D8TPYS7DGG       Bret Adams           10
12  TH60C9D8TPYS7DGG  Ginger Hoagland            7
13  W1Y086SRAVUZU1AL       Bret Adams           10
14  WKB58Q8BHLOFQAB5    Michael White            8
14  WKB58Q8BHLOFQAB5  Ginger Hoagland            7
14  WKB58Q8BHLOFQAB5       Bret Adams           10
15  N8DOG58MW238BHA9  Ginger Hoagland            7
15  N8DOG58MW238BHA9   Cheryl Feaster            6
15  N8DOG58MW238BHA9      Glenn Gould            9
15  N8DOG58MW238BHA9       Bret Adams           10
16  DPDV9UGF0SUCYTGW    Michael White            8
17  KNF3E3QTOQ22J269      Glenn Gould            9
17  KNF3E3QTOQ22J269   Cheryl Feaster            6
17  KNF3E3QTOQ22J269  Ginger Hoagland            7
17  KNF3E3QTOQ22J269   Amanda Knowles            5
18  LEED1HY47M8BR5VL      Glenn Gould            9
19  LSJPNJQLDTIRNWAL   Amanda Knowles            5
19  LSJPNJQLDTIRNWAL       Bret Adams           10
20  6UX5RMHJ1GK1F9YQ  Anthony Emerson           11
20  6UX5RMHJ1GK1F9YQ     Irvin Gentry           12
20  6UX5RMHJ1GK1F9YQ      Melba Inlow           13
21  5SYB15QEFWD1E4Q4  Anthony Emerson           11
21  5SYB15QEFWD1E4Q4      Emma Steitz           14
21  5SYB15QEFWD1E4Q4      Melba Inlow           13
21  5SYB15QEFWD1E4Q4     Irvin Gentry           12
21  5SYB15QEFWD1E4Q4  Kelly Killebrew           15
22  W5S8VZ61WJONS4EE     Irvin Gentry           12
22  W5S8VZ61WJONS4EE  Kelly Killebrew           15
23  795SVIJKO8KS3ZEL      Emma Steitz           14
24  8070KEFYSSPWPCD0       Lewis Eyre           16
25  RUQOHROBGBOSNUO4  Anthony Emerson           11
25  RUQOHROBGBOSNUO4  Kelly Killebrew           15
25  RUQOHROBGBOSNUO4       Lewis Eyre           16
26  6P91QRADC2O9WOVT  Kelly Killebrew           15
26  6P91QRADC2O9WOVT       Lewis Eyre           16
26  6P91QRADC2O9WOVT     Irvin Gentry           12
26  6P91QRADC2O9WOVT      Emma Steitz           14
26  6P91QRADC2O9WOVT  Anthony Emerson           11
```
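
The question asks whether the link can be built through mapping. One possible route, sketched here under assumptions, is to keep a separate customer table (ID = row index + 1, as in the question) and look each participant up with `Series.map`. The three rows are copied from the sample data; `orders`, `id_by_name`, and `link_df` are illustrative names that do not appear in the original post.

```python
import pandas as pd

# Three rows copied from the sample data; Participants is still the raw string.
invoice_df = pd.DataFrame(
    {
        "Order Id": ["839FKFW2LLX4LMBB", "041ORQM5OIHTIU6L", "AT0R4DFYYAFOC88Q"],
        "Participants": [
            "['David Bishop']",
            "['Karen Stansell']",
            "['David Bishop' 'Susan Guerrero' 'Karen Stansell']",
        ],
    }
)

# One row per (order, participant) pair.
names = invoice_df["Participants"].str.findall(r"'(.*?)'").explode()
orders = invoice_df[["Order Id"]].join(names.rename("CustomerName"))

# Customer table: unique names in order of first appearance, ID = row index + 1.
customers_df = pd.DataFrame({"CustomerName": names.unique()})
customers_df["customer_id"] = customers_df.index + 1

# Name -> ID lookup Series, applied to every participant row with Series.map.
id_by_name = customers_df.set_index("CustomerName")["customer_id"]
orders["customer_id"] = orders["CustomerName"].map(id_by_name)

# Final link table: one (customer_id, Order Id) pair per row.
link_df = orders[["customer_id", "Order Id"]].reset_index(drop=True)
print(link_df)
```

Both `unique()` and `factorize()` number names in order of first appearance, so the mapped IDs should agree with the `factorize` result above. The mapping version is mainly useful when the customer table is needed as a separate dataframe anyway.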