Change multiple column entries in a pandas dataframe

Question

I have a large pandas DataFrame:

df.head()

   Year  Average Elo    Club Country  Level
0  2017  1283.267334  Kukesi     ALB      0
1  2018  1263.912195  Kukesi     ALB      0
2  2019  1212.714661  Kukesi     ALB      0
3  2020  1231.063379  Kukesi     ALB      0
4  2018  1213.269553    Laci     ALB      0

The Country column contains abbreviated country codes for 55 countries. I want to change each of these codes to the full name of the country (for example, ALB to Albania):

df['Country'].unique()

array(['ALB', 'AND', 'ARM', 'AUT', 'AZE', 'BEL', 'BHZ', 'BLR', 'BUL',
       'CRO', 'CYP', 'CZE', 'DEN', 'ENG', 'ESP', 'EST', 'FAR', 'FIN',
       'FRA', 'GEO', 'GER', 'GIB', 'GRE', 'HUN', 'IRL', 'ISL', 'ISR',
       'ITA', 'KAZ', 'KOS', 'LAT', 'LIE', 'LIT', 'LUX', 'MAC', 'MLT',
       'MNT', 'MOL', 'NED', 'NIR', 'NOR', 'POL', 'POR', 'ROM', 'RUS',
       'SCO', 'SLK', 'SMR', 'SRB', 'SUI', 'SVN', 'SWE', 'TUR', 'UKR',
       'WAL'], dtype=object)

I have written code that does this for ALB specifically:

df.Country[df.Country=='ALB'] = 'Albania'

Is there a way to do this for all 55 countries?

Answer 1

Score: 1

You could use pycountry to build a dictionary that maps each 3-letter code to the country name:

import pycountry

mapper = {c.alpha_3: c.name for c in pycountry.countries}
# {'ABW': 'Aruba', 'AFG': 'Afghanistan', 'AGO': 'Angola',
#  'AIA': 'Anguilla', 'ALA': 'Åland Islands', 'ALB': 'Albania',
#  'AND': 'Andorra', ...}

df['Country'] = df['Country'].map(mapper)
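
One thing to watch with map: any code that is missing from the dictionary becomes NaN, so it can be worth checking which codes were not matched before overwriting the column. The following is a minimal sketch of that check. It uses a small hand-written mapper and a made-up Series instead of pycountry and the real data, so those names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical, deliberately incomplete mapping (stand-in for the pycountry dict)
mapper = {"ALB": "Albania", "AND": "Andorra"}

# Made-up sample of the Country column
codes = pd.Series(["ALB", "AND", "ENG", "ALB"], name="Country")

# map() returns NaN for every code that has no key in the dictionary
mapped = codes.map(mapper)

# Codes that still need an entry in the mapping
missing = sorted(codes[mapped.isna()].unique())
print(missing)

# Optionally keep the original code wherever no full name was found
filled = mapped.fillna(codes)
print(filled.tolist())
```

Running the same check against the real column should show which of the 55 codes need a manual entry.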

However, not all of your codes are standard ISO 3166-1 alpha-3 codes, so as a fallback you could also map from the first 3 letters of each country name:

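# dict | dict requires Python 3.9+; on duplicate keys the right-hand dict wins,
# so the official alpha_3 codes take precedence over the 3-letter name prefixes
mapper = (
    {c.name[:3].upper(): c.name for c in pycountry.countries}
    | {c.alpha_3: c.name for c in pycountry.countries}
)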
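
To see why the order of the two dictionaries matters, here is a small sketch of the same merge in plain Python. The countries list is a hypothetical stand-in for pycountry.countries, holding a few (alpha_3, name) pairs, so the example runs without the library.

```python
# Hypothetical stand-in for pycountry.countries: (alpha_3, name) pairs
countries = [
    ("ALB", "Albania"),
    ("AUS", "Australia"),
    ("AUT", "Austria"),
    ("DNK", "Denmark"),
]

# Fallback keys: first three letters of the name, upper-cased.
# Prefixes can collide: "Australia" and "Austria" both give "AUS",
# and the later entry silently overwrites the earlier one.
by_prefix = {name[:3].upper(): name for _, name in countries}

# Official keys: the alpha-3 code
by_alpha3 = {alpha_3: name for alpha_3, name in countries}

# Right-hand operand wins on duplicate keys (dict | dict needs Python 3.9+)
mapper = by_prefix | by_alpha3

print(by_prefix["AUS"])  # prefix-only lookup picks the wrong country
print(mapper["AUS"])     # official code overrides the colliding prefix
print(mapper["DEN"])     # non-ISO code resolved through the prefix fallback
```

Because prefixes can collide or simply not match the code used in the data, the fallback is a heuristic; any remaining codes would still need manual entries.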

Answer 2

Score: 1

Yes, you can replace the codes for all 55 countries in one step. One way is to build a dictionary that maps each country code to its full name and pass it to replace. Here's an example:

import pandas as pd

# Sample data
data = {
    'Year': [2017, 2018, 2019, 2020, 2018],
    'Average Elo': [1283.267334, 1263.912195, 1212.714661, 1231.063379, 1213.269553],
    'Country': ['ALB', 'AND', 'ARM', 'AUT', 'AZE'],
    'Level': [0, 0, 0, 0, 0]
}

df = pd.DataFrame(data)

# Country code to full name mapping
country_mapping = {
    'ALB': 'Albania',
    'AND': 'Andorra',
    'ARM': 'Armenia',
    # Add more country code to full name mappings here
    # For all 55 countries
}

# Replace country codes with full names
df['Country'] = df['Country'].replace(country_mapping)

print(df.head())

Add an entry to the "country_mapping" dictionary for each of the 55 countries, with the country code as the key and the full name as the value. Calling "df['Country'].replace(country_mapping)" then replaces every code that has an entry with its full name; codes without an entry are left unchanged.
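
The difference between replace and map for codes that are missing from the dictionary can be sketched as follows. The DataFrame and the code 'XXX' are made up for illustration, and the mapping is intentionally incomplete.

```python
import pandas as pd

# Made-up data; 'XXX' is a placeholder code with no entry in the mapping
df = pd.DataFrame({"Country": ["ALB", "ENG", "XXX"]})

# Hand-written entries for illustration
country_mapping = {"ALB": "Albania", "ENG": "England"}

# replace() leaves values without a matching key untouched
replaced = df["Country"].replace(country_mapping)

# map() turns values without a matching key into NaN
mapped = df["Country"].map(country_mapping)

print(replaced.tolist())
print(mapped.isna().tolist())
```

Which behaviour is preferable depends on the goal: replace keeps the data usable while the dictionary is still incomplete, whereas map makes missing entries easy to spot.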

huangapple
  • Posted on July 3, 2023, 19:01:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76604122.html