Change multiple column entries in a pandas dataframe
Question
I have a large pandas DataFrame:
df.head()
Year Average Elo Club Country Level
0 2017 1283.267334 Kukesi ALB 0
1 2018 1263.912195 Kukesi ALB 0
2 2019 1212.714661 Kukesi ALB 0
3 2020 1231.063379 Kukesi ALB 0
4 2018 1213.269553 Laci ALB 0
The Country column contains abbreviated country codes for 55 countries. I want to change each of these country codes to the full name of the country (for example, ALB to Albania):
df['Country'].unique()
array(['ALB', 'AND', 'ARM', 'AUT', 'AZE', 'BEL', 'BHZ', 'BLR', 'BUL',
'CRO', 'CYP', 'CZE', 'DEN', 'ENG', 'ESP', 'EST', 'FAR', 'FIN',
'FRA', 'GEO', 'GER', 'GIB', 'GRE', 'HUN', 'IRL', 'ISL', 'ISR',
'ITA', 'KAZ', 'KOS', 'LAT', 'LIE', 'LIT', 'LUX', 'MAC', 'MLT',
'MNT', 'MOL', 'NED', 'NIR', 'NOR', 'POL', 'POR', 'ROM', 'RUS',
'SCO', 'SLK', 'SMR', 'SRB', 'SUI', 'SVN', 'SWE', 'TUR', 'UKR',
'WAL'], dtype=object)
I have written code that does this for ALB specifically:
df.Country[df.Country=='ALB'] = 'Albania'
Is there a way to do this for all 55 countries?
Answer 1
Score: 1
You could use pycountry to build a dictionary that maps the 3-letter codes to country names:
import pycountry
mapper = {c.alpha_3: c.name for c in pycountry.countries}
# {'ABW': 'Aruba', 'AFG': 'Afghanistan', 'AGO': 'Angola',
# 'AIA': 'Anguilla', 'ALA': 'Åland Islands', 'ALB': 'Albania',
# 'AND': 'Andorra', ...}
df['Country'] = df['Country'].map(mapper)
However, your codes are not exactly standard, so you could also map from the first 3 letters of each country name as a fallback:
mapper = ({c.name[:3].upper(): c.name for c in pycountry.countries}
          | {c.alpha_3: c.name for c in pycountry.countries})  # dict | dict needs Python 3.9+
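Even with the fallback, some codes may stay unmapped, and `Series.map` turns those into NaN. A code can also match the wrong ISO entry (MAC, for instance, is the ISO code for Macao, which is presumably not what a European club dataset means). The sketch below is one way to check coverage before overwriting the column. The small `mapper` is a hypothetical stand-in for the pycountry-based dictionary, and the sample codes are illustrative only.

```python
import pandas as pd

# Illustrative data; the real frame has 55 distinct codes.
df = pd.DataFrame({"Country": ["ALB", "ENG", "AND", "SCO"]})

# Hypothetical partial mapping standing in for the pycountry-based dictionary.
mapper = {"ALB": "Albania", "AND": "Andorra"}

# map() returns NaN for every code that has no key in the dictionary.
mapped = df["Country"].map(mapper)

# List the codes that still need a manual entry.
unmapped_codes = sorted(df.loc[mapped.isna(), "Country"].unique())
print(unmapped_codes)

# Keep the original code wherever no full name was found.
df["Country"] = mapped.fillna(df["Country"])
print(df)
```

Any codes reported as unmapped can then be added to the dictionary by hand before running the mapping again.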
Answer 2
Score: 1
Yes, you can automate the process of replacing country codes with full names for all 55 countries. One way to achieve this is by using a dictionary that maps each country code to its corresponding full name. Here's an example:
import pandas as pd
# Sample data
data = {
'Year': [2017, 2018, 2019, 2020, 2018],
'Average Elo': [1283.267334, 1263.912195, 1212.714661, 1231.063379, 1213.269553],
'Club': ['Kukesi', 'Kukesi', 'Kukesi', 'Kukesi', 'Laci'],
'Country': ['ALB', 'AND', 'ARM', 'AUT', 'AZE'],
'Level': [0, 0, 0, 0, 0]
}
df = pd.DataFrame(data)
# Country code to full name mapping
country_mapping = {
'ALB': 'Albania',
'AND': 'Andorra',
'ARM': 'Armenia',
# Add more country code to full name mappings here
# For all 55 countries
}
# Replace country codes with full names
df['Country'] = df['Country'].replace(country_mapping)
print(df.head())
Add an entry to the "country_mapping" dictionary for each of the 55 countries, with the country code as the key and the full name as the value. Calling "df['Country'].replace(country_mapping)" then replaces every code that has an entry with its full name; codes without an entry are left unchanged.
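The two answers use different methods, `replace` and `map`, and they differ in how they treat codes missing from the dictionary. The following minimal sketch, using a deliberately incomplete mapping and made-up sample codes, shows the difference:

```python
import pandas as pd

codes = pd.Series(["ALB", "AND", "AUT"])

# Deliberately incomplete mapping: "AUT" has no entry.
country_mapping = {"ALB": "Albania", "AND": "Andorra"}

# replace() leaves values without a dictionary entry untouched.
replaced = codes.replace(country_mapping)
print(replaced.tolist())

# map() converts values without a dictionary entry to NaN.
mapped = codes.map(country_mapping)
print(mapped.isna().tolist())
```

`replace` is the safer choice while the dictionary is still incomplete, whereas `map` makes missing entries easy to spot.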