Making an sns.lmplot scatterplot with two groups of columns summed on each row

Question

Basically, my dataset looks like this.

      Private Sector Expenditure  Public Sector Expenditure  \
year                                                          
2001                     20.4502                    11.8767   
2002                     20.5501                    13.1333   
2003                     20.5362                    13.4328   
2004                     25.6956                    14.7190   
2005                     25.6956                    15.5087   
2006                     32.8184                    17.1671   
2007                     42.2216                    21.0410   
2008                     51.0546                    21.0410   
2009                     36.9461                    23.1826   
2010                     37.7380                    25.4141   
2011                     44.5643                    28.1917   
2012                     42.4928                    28.2885   
2013                     43.3318                    30.6922   
2014                     50.0689                    33.0973   
2015                     55.1194                    37.2753   
2016                     53.4095                    37.9928   
2017                     53.8613                    36.7543   

      Content Services Revenue  Hardware Revenue  IT Services Revenue  \
year                                                                    
2001                      13.2              29.8                 32.0   
2002                      16.0              27.4                 36.5   
2003                       9.0              61.7                 25.6   
2004                      12.0              61.1                 26.8   
2005                       6.3              64.9                 25.4   
2006                       6.3              41.5                 29.1   
2007                       9.8              52.1                 44.2   
2008                      10.9              61.6                 62.4   
2009                      13.0             161.0                 71.0   
2010                      15.0             137.0                 75.0   
2011                      22.0             139.0                 67.0   
2012                      15.0             139.0                 75.0   
2013                      19.0             159.0                 75.0   
2014                      21.0             170.0                100.0   
2015                      21.0             205.0                102.0   
2016                      17.0             193.0                106.0   
2017                       0.0             188.0                207.0   

      Software Revenue  Telecommunication Services Revenue  \
year                                                         
2001               9.0                                58.5   
2002              10.2                                60.7   
2003              37.6                                16.6   
2004              32.8                                16.4   
2005              45.9                                15.8   
2006              16.8                                54.9   
2007              16.9                                58.3   
2008              21.3                                72.0   
2009              64.0                                94.0   
2010              30.0                               106.0   
2011              33.0                                97.0   
2012              33.0                                92.0   
2013              97.0                               108.0   
2014             105.0                               110.0   
2015             102.0                                99.0   
2016              74.0                                90.0   
2017              69.0                                79.0   

      Total Mobile Subscriptions  
year                              
2001                   2877017.0  
2002                   3067033.0  
2003                   3358817.0  
2004                   3675142.0  
2005                   4090633.0  
2006                   4391733.0  
2007                   5073833.0  
2008                   6112742.0  
2009                   6576875.0  
2010                   7058117.0  
2011                   7540733.0  
2012                   7868608.0  
2013                   8235317.0  
2014                   8273658.0  
2015                   8140783.0  
2016                   8312475.0  
2017                   8427542.0

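I am trying to make a seaborn.lmplot with ['Private Sector Expenditure', 'Public Sector Expenditure'] on the x-axis and ['Content Services Revenue', 'Hardware Revenue', 'IT Services Revenue', 'Software Revenue', 'Telecommunication Services Revenue'] on the y-axis, where the columns in each group are summed across every row to give one value per year on each axis.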

filled_revenue = df_final.groupby(sum(['Private Sector Expenditure', 'Public Sector Expenditure']))
filled_expenditure = df_final.groupby(sum(['Content Services Revenue', 'Hardware Revenue', 
                                       'IT Services Revenue', 'Software Revenue', 'Telecommunication Services Revenue']))

sns.lmplot(data=df_final, x=filled_expenditure, y=filled_revenue)

I tried this, but clearly something is wrong, and I'm not experienced enough to understand how to subset and sum the data per row.


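Answer 1

Score: 1

I think you want to sum "Private Sector Expenditure" and "Public Sector Expenditure" for the x-axis, and "Content Services Revenue", "Hardware Revenue", "IT Services Revenue", "Software Revenue" and "Telecommunication Services Revenue" for the y-axis.

So the final code for your data is: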

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
  "Private Sector Expenditure": [20.4502, 20.5501, 20.5362, 25.6956, 25.6956, 32.8184, 42.2216, 51.0546, 36.9461, 37.7380, 44.5643, 42.4928, 43.3318, 50.0689, 55.1194, 53.4095, 53.8613],
  "Public Sector Expenditure": [11.8767, 13.1333, 13.4328, 14.7190, 15.5087, 17.1671, 21.0410, 21.0410, 23.1826, 25.4141, 28.1917, 28.2885, 30.6922, 33.0973, 37.2753, 37.9928, 36.7543],
  "Content Services Revenue": [13.2, 16.0, 9.0, 12.0, 6.3, 6.3, 9.8, 10.9, 13.0, 15.0, 22.0, 15.0, 19.0, 21.0, 21.0, 17.0, 0.0],
  "Hardware Revenue": [29.8, 27.4, 61.7, 61.1, 64.9, 41.5, 52.1, 61.6, 161.0, 137.0, 139.0, 139.0, 159.0, 170.0, 205.0, 193.0, 188.0],
  "IT Services Revenue": [32.0, 36.5, 25.6, 26.8, 25.4, 29.1, 44.2, 62.4, 71.0, 75.0, 67.0, 75.0, 75.0, 100.0, 102.0, 106.0, 207.0],
  "Software Revenue": [9.0, 10.2, 37.6, 32.8, 45.9, 16.8, 16.9, 21.3, 64.0, 30.0, 33.0, 33.0, 97.0, 105.0, 102.0, 74.0, 69.0],
  "Telecommunication Services Revenue": [58.5, 60.7, 16.6, 16.4, 15.8, 54.9, 58.3, 72.0, 94.0, 106.0, 97.0, 92.0, 108.0, 110.0, 99.0, 90.0, 79.0],
  "Total Mobile Subscriptions": [2877017.0, 3067033.0, 3358817.0, 3675142.0, 4090633.0, 4391733.0, 5073833.0, 6112742.0, 6576875.0, 7058117.0, 7540733.0, 7868608.0, 8235317.0, 8273658.0, 8140783.0, 8312475.0, 8427542.0]
}, index=[2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017])

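# Sum the two expenditure columns across each row (axis=1) to get one value per year.
df["Expenditure"] = df[["Private Sector Expenditure", "Public Sector Expenditure"]].sum(axis=1)
# Do the same for the five revenue columns.
df["Revenue"] = df[["Content Services Revenue", "Hardware Revenue", "IT Services Revenue", "Software Revenue", "Telecommunication Services Revenue"]].sum(axis=1)

# lmplot takes column names of `data` for x and y and draws a scatterplot with a fitted regression line.
sns.lmplot(data=df, x="Expenditure", y="Revenue")

plt.show()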
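
As for why the attempt in the question fails: Python's built-in sum() starts from the integer 0, so calling it on a list of column names tries to add an int to a str and raises a TypeError before groupby is ever reached. lmplot also expects x and y to be names of columns in data, not groupby objects. Here is a minimal sketch of the difference, using a three-row, four-column subset of the data above (the subset and the variable names are chosen only for illustration):

```python
import pandas as pd

# Three rows from the question's data, with two of the five revenue columns.
df = pd.DataFrame(
    {
        "Private Sector Expenditure": [20.4502, 20.5501, 20.5362],
        "Public Sector Expenditure": [11.8767, 13.1333, 13.4328],
        "Hardware Revenue": [29.8, 27.4, 61.7],
        "Software Revenue": [9.0, 10.2, 37.6],
    },
    index=pd.Index([2001, 2002, 2003], name="year"),
)

# The built-in sum() starts at 0, so adding column-name strings fails.
try:
    sum(["Private Sector Expenditure", "Public Sector Expenditure"])
    builtin_sum_error = None
except TypeError as exc:
    builtin_sum_error = exc

# Selecting the columns first and summing with axis=1 adds across each row.
expenditure = df[["Private Sector Expenditure", "Public Sector Expenditure"]].sum(axis=1)
revenue = df[["Hardware Revenue", "Software Revenue"]].sum(axis=1)

print(type(builtin_sum_error).__name__)
print(expenditure.round(4).tolist())
print(revenue.round(1).tolist())
```

Each resulting Series has one value per year, which is why storing them as new columns and passing the column names to lmplot works.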

The result will be:

[Figure: sns.lmplot scatterplot with the two groups of columns summed on each row]
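
lmplot draws the regression line but does not hand back its coefficients. If the slope and intercept are wanted as numbers, one option (not part of the original answer) is to fit a degree-1 polynomial with numpy.polyfit on the same summed columns. This is an ordinary least-squares fit, so it should correspond to the straight line lmplot draws with its default settings. The sketch below assumes the same data as above, with the unused "Total Mobile Subscriptions" column left out:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Private Sector Expenditure": [20.4502, 20.5501, 20.5362, 25.6956, 25.6956, 32.8184, 42.2216, 51.0546, 36.9461, 37.7380, 44.5643, 42.4928, 43.3318, 50.0689, 55.1194, 53.4095, 53.8613],
    "Public Sector Expenditure": [11.8767, 13.1333, 13.4328, 14.7190, 15.5087, 17.1671, 21.0410, 21.0410, 23.1826, 25.4141, 28.1917, 28.2885, 30.6922, 33.0973, 37.2753, 37.9928, 36.7543],
    "Content Services Revenue": [13.2, 16.0, 9.0, 12.0, 6.3, 6.3, 9.8, 10.9, 13.0, 15.0, 22.0, 15.0, 19.0, 21.0, 21.0, 17.0, 0.0],
    "Hardware Revenue": [29.8, 27.4, 61.7, 61.1, 64.9, 41.5, 52.1, 61.6, 161.0, 137.0, 139.0, 139.0, 159.0, 170.0, 205.0, 193.0, 188.0],
    "IT Services Revenue": [32.0, 36.5, 25.6, 26.8, 25.4, 29.1, 44.2, 62.4, 71.0, 75.0, 67.0, 75.0, 75.0, 100.0, 102.0, 106.0, 207.0],
    "Software Revenue": [9.0, 10.2, 37.6, 32.8, 45.9, 16.8, 16.9, 21.3, 64.0, 30.0, 33.0, 33.0, 97.0, 105.0, 102.0, 74.0, 69.0],
    "Telecommunication Services Revenue": [58.5, 60.7, 16.6, 16.4, 15.8, 54.9, 58.3, 72.0, 94.0, 106.0, 97.0, 92.0, 108.0, 110.0, 99.0, 90.0, 79.0],
}, index=pd.Index(range(2001, 2018), name="year"))

# Pick the column groups by name suffix, then sum across each row.
expenditure_cols = [c for c in df.columns if c.endswith("Expenditure")]
revenue_cols = [c for c in df.columns if c.endswith("Revenue")]
df["Expenditure"] = df[expenditure_cols].sum(axis=1)
df["Revenue"] = df[revenue_cols].sum(axis=1)

# Degree-1 least-squares fit; polyfit returns the highest-order coefficient first.
slope, intercept = np.polyfit(df["Expenditure"], df["Revenue"], deg=1)
residuals = df["Revenue"] - (slope * df["Expenditure"] + intercept)

print(f"Revenue is approximately {slope:.3f} * Expenditure + {intercept:.3f}")
```

Selecting columns by suffix is only a convenience here; listing the column names explicitly, as in the answer, is safer if other columns could share those endings.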


Posted by huangapple on February 6, 2023, 18:50:57
Link to this post: https://go.coder-hub.com/75360329.html