ValueError: cannot reindex on an axis with duplicate labels while using assign
Question
I am trying to split the values in the engine_type column on the _ delimiter using the following code:
df = pd.read_csv("/content/sample_data/used_cars.csv")
dds = df.assign(engines_type= lambda x: x['engine_type'].str.split(r'\s*_\s*').explode()).reset_index()
I am getting the following error:
ValueError: cannot reindex on an axis with duplicate labels
What could be the reason for this error?
Thanks in advance.
Answer 1
Score: 0
Try this approach:
import pandas as pd

# Say you have a DataFrame like this:
df = pd.DataFrame({
    'car_model': ['Renault', 'Hyundai', 'Ford'],
    'engine_type': ['Gas', 'Diesel_Petrol', 'Gas_Hybrid']
})
# Split into lists first, then explode the whole DataFrame so every column stays aligned
dds = (df.assign(engine_type=df['engine_type'].str.split(r'\s*_\s*'))
         .explode('engine_type')
         .reset_index(drop=True)
)
print(dds)
  car_model engine_type
0   Renault         Gas
1   Hyundai      Diesel
2   Hyundai      Petrol
3      Ford         Gas
4      Ford      Hybrid
Note: If this doesn't help, please provide a sample DataFrame and the desired output.
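As for the reason behind the error: as far as I can tell, it comes from calling explode() on the Series inside assign(). Exploding a Series repeats the index label of every row that was split into several values, and assign() then has to align that longer Series back onto the original index, which it cannot do when labels are duplicated. Here is a minimal sketch that reproduces this, assuming the same small sample frame as above (your real CSV may of course differ):

```python
import pandas as pd

# Sample data assumed for illustration; not the asker's actual CSV.
df = pd.DataFrame({
    "car_model": ["Renault", "Hyundai", "Ford"],
    "engine_type": ["Gas", "Diesel_Petrol", "Gas_Hybrid"],
})

# Exploding the split Series repeats the index label of each row that
# held more than one value, so 3 rows become 5 entries.
exploded = df["engine_type"].str.split(r"\s*_\s*").explode()
index_labels = exploded.index.tolist()
has_duplicates = exploded.index.has_duplicates

# assign() aligns the new column to df's index (0, 1, 2). That alignment
# fails when the incoming Series carries duplicate labels.
try:
    df.assign(engines_type=exploded)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(index_labels)    # [0, 1, 1, 2, 2]
print(has_duplicates)  # True
print(error_message)   # exact wording depends on the pandas version
```

Calling explode('engine_type') on the DataFrame instead, as in the code above, repeats the other columns together with the index, so no alignment step is needed.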