Change multiple column entries in a pandas dataframe

Question

I have a large pandas dataframe:

    df.head()
       Year  Average Elo    Club  Country  Level
    0  2017  1283.267334  Kukesi      ALB      0
    1  2018  1263.912195  Kukesi      ALB      0
    2  2019  1212.714661  Kukesi      ALB      0
    3  2020  1231.063379  Kukesi      ALB      0
    4  2018  1213.269553    Laci      ALB      0

The Country column contains abbreviated country codes for 55 countries. I want to change each of these codes to the full name of the country (for example, ALB to Albania):

    df['Country'].unique()
    array(['ALB', 'AND', 'ARM', 'AUT', 'AZE', 'BEL', 'BHZ', 'BLR', 'BUL',
           'CRO', 'CYP', 'CZE', 'DEN', 'ENG', 'ESP', 'EST', 'FAR', 'FIN',
           'FRA', 'GEO', 'GER', 'GIB', 'GRE', 'HUN', 'IRL', 'ISL', 'ISR',
           'ITA', 'KAZ', 'KOS', 'LAT', 'LIE', 'LIT', 'LUX', 'MAC', 'MLT',
           'MNT', 'MOL', 'NED', 'NIR', 'NOR', 'POL', 'POR', 'ROM', 'RUS',
           'SCO', 'SLK', 'SMR', 'SRB', 'SUI', 'SVN', 'SWE', 'TUR', 'UKR',
           'WAL'], dtype=object)

I have written code that does this for ALB specifically:

    df.Country[df.Country=='ALB'] = 'Albania'

Is there a way to do this for all 55 countries?

Answer 1

Score: 1

You could use pycountry to build a dictionary that maps each 3-letter code to the country name:

    import pycountry

    # Map each ISO 3166-1 alpha-3 code to the country name
    mapper = {c.alpha_3: c.name for c in pycountry.countries}
    # {'ABW': 'Aruba', 'AFG': 'Afghanistan', 'AGO': 'Angola',
    #  'AIA': 'Anguilla', 'ALA': 'Åland Islands', 'ALB': 'Albania',
    #  'AND': 'Andorra', ...}
    df['Country'] = df['Country'].map(mapper)

However, your codes are not all standard ISO codes, so you could also map from the first 3 letters of the country name as a fallback:

    # First 3 letters of each name as a fallback; exact alpha-3 codes take precedence
    # (the dict union operator | requires Python 3.9+)
    mapper = ({c.name[:3].upper(): c.name for c in pycountry.countries}
              | {c.alpha_3: c.name for c in pycountry.countries}
             )
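
One thing to keep in mind with `Series.map` is that any code missing from the dictionary becomes NaN. Below is a minimal sketch of how to keep the original code where no match is found and list the unmatched codes. It assumes a small hand-written `mapper` as a stand-in for the pycountry one, and the sample codes are only for illustration:

```python
import pandas as pd

# Hypothetical partial mapper, standing in for the pycountry-based dictionary
mapper = {"ALB": "Albania", "AND": "Andorra"}

df = pd.DataFrame({"Country": ["ALB", "AND", "ENG"]})

# 'ENG' has no entry in the mapper, so map() returns NaN for that row
mapped = df["Country"].map(mapper)

# Collect the codes that could not be mapped, before overwriting the column
unmatched = sorted(df.loc[mapped.isna(), "Country"].unique())

# Fall back to the original code wherever the lookup failed
df["Country"] = mapped.fillna(df["Country"])

print(unmatched)
print(df["Country"].tolist())
```

The `unmatched` list shows which codes still need a manual entry in the dictionary.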

Answer 2

Score: 1

Yes, you can automate the process of replacing country codes with full names for all 55 countries. One way to achieve this is by using a dictionary that maps each country code to its corresponding full name. Here's an example:

    import pandas as pd

    # Sample data
    data = {
        'Year': [2017, 2018, 2019, 2020, 2018],
        'Average Elo': [1283.267334, 1263.912195, 1212.714661, 1231.063379, 1213.269553],
        'Country': ['ALB', 'AND', 'ARM', 'AUT', 'AZE'],
        'Level': [0, 0, 0, 0, 0]
    }
    df = pd.DataFrame(data)

    # Country code to full name mapping
    country_mapping = {
        'ALB': 'Albania',
        'AND': 'Andorra',
        'ARM': 'Armenia',
        # Add more country code to full name mappings here
        # for all 55 countries
    }

    # Replace country codes with full names
    df['Country'] = df['Country'].replace(country_mapping)
    print(df.head())

In the country_mapping dictionary, add an entry for each of the 55 countries, with the country code as the key and the full name as the value. Calling df['Country'].replace(country_mapping) then replaces every code that has an entry with its full name. Codes without an entry are left unchanged.
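
Because `replace` silently leaves codes with no dictionary entry unchanged, it may help to check the dictionary's coverage first. A minimal sketch, assuming the same deliberately incomplete three-entry dictionary as above and a cut-down sample dataframe:

```python
import pandas as pd

df = pd.DataFrame({"Country": ["ALB", "AND", "ARM", "AUT", "AZE"]})

# Deliberately incomplete mapping, for illustration only
country_mapping = {"ALB": "Albania", "AND": "Andorra", "ARM": "Armenia"}

# Codes that appear in the column but have no entry in the dictionary
missing = sorted(set(df["Country"].unique()) - set(country_mapping))
print(missing)

# replace() swaps the mapped codes and leaves the rest as they are
df["Country"] = df["Country"].replace(country_mapping)
print(df["Country"].tolist())
```

An empty `missing` list would indicate that every code in the column has a full name in the dictionary.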

huangapple
  • Posted on July 3, 2023, 19:01:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76604122.html