2023-02-06 18:50:57

Making sns.lmplot, scatterplot with two groups of data summed on each row respectively

Question

Basically, my dataset looks like this.

      Private Sector Expenditure  Public Sector Expenditure  \
year                                                          
2001                     20.4502                    11.8767   
2002                     20.5501                    13.1333   
2003                     20.5362                    13.4328   
2004                     25.6956                    14.7190   
2005                     25.6956                    15.5087   
2006                     32.8184                    17.1671   
2007                     42.2216                    21.0410   
2008                     51.0546                    21.0410   
2009                     36.9461                    23.1826   
2010                     37.7380                    25.4141   
2011                     44.5643                    28.1917   
2012                     42.4928                    28.2885   
2013                     43.3318                    30.6922   
2014                     50.0689                    33.0973   
2015                     55.1194                    37.2753   
2016                     53.4095                    37.9928   
2017                     53.8613                    36.7543   
      Content Services Revenue  Hardware Revenue  IT Services Revenue  \
year                                                                    
2001                      13.2              29.8                 32.0   
2002                      16.0              27.4                 36.5   
2003                       9.0              61.7                 25.6   
2004                      12.0              61.1                 26.8   
2005                       6.3              64.9                 25.4   
2006                       6.3              41.5                 29.1   
2007                       9.8              52.1                 44.2   
2008                      10.9              61.6                 62.4   
2009                      13.0             161.0                 71.0   
2010                      15.0             137.0                 75.0   
2011                      22.0             139.0                 67.0   
2012                      15.0             139.0                 75.0   
2013                      19.0             159.0                 75.0   
2014                      21.0             170.0                100.0   
2015                      21.0             205.0                102.0   
2016                      17.0             193.0                106.0   
2017                       0.0             188.0                207.0   
      Software Revenue  Telecommunication Services Revenue  \
year                                                         
2001               9.0                                58.5   
2002              10.2                                60.7   
2003              37.6                                16.6   
2004              32.8                                16.4   
2005              45.9                                15.8   
2006              16.8                                54.9   
2007              16.9                                58.3   
2008              21.3                                72.0   
2009              64.0                                94.0   
2010              30.0                               106.0   
2011              33.0                                97.0   
2012              33.0                                92.0   
2013              97.0                               108.0   
2014             105.0                               110.0   
2015             102.0                                99.0   
2016              74.0                                90.0   
2017              69.0                                79.0   
      Total Mobile Subscriptions  
year                              
2001                   2877017.0  
2002                   3067033.0  
2003                   3358817.0  
2004                   3675142.0  
2005                   4090633.0  
2006                   4391733.0  
2007                   5073833.0  
2008                   6112742.0  
2009                   6576875.0  
2010                   7058117.0  
2011                   7540733.0  
2012                   7868608.0  
2013                   8235317.0  
2014                   8273658.0  
2015                   8140783.0  
2016                   8312475.0  
2017                   8427542.0

I am trying to make a seaborn.lmplot with ['Private Sector Expenditure', 'Public Sector Expenditure'] on the x-axis and ['Content Services Revenue', 'Hardware Revenue', 'IT Services Revenue', 'Software Revenue', 'Telecommunication Services Revenue'] on the y-axis, where the columns in each group are summed row by row so that each year has one value on the x-axis and one on the y-axis.

filled_revenue = df_final.groupby(sum(['Private Sector Expenditure', 'Public Sector Expenditure']))
filled_expenditure = df_final.groupby(sum(['Content Services Revenue', 'Hardware Revenue', 
                                       'IT Services Revenue', 'Software Revenue', 'Telecommunication Services Revenue']))
sns.lmplot(data=df_final, x=filled_expenditure, y=filled_revenue)

I tried this, but clearly something is wrong, and I'm not experienced enough to understand how to subset and sum the data per row.

Answer 1

Score: 1

I think you are trying to sum "Private Sector Expenditure" and "Public Sector Expenditure" for the x-axis, and "Content Services Revenue", "Hardware Revenue", "IT Services Revenue", "Software Revenue", and "Telecommunication Services Revenue" for the y-axis.

So the final code for your data will be:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Rebuild the dataset from the question, indexed by year
df = pd.DataFrame({
    "Private Sector Expenditure": [20.4502, 20.5501, 20.5362, 25.6956, 25.6956, 32.8184, 42.2216, 51.0546, 36.9461, 37.7380, 44.5643, 42.4928, 43.3318, 50.0689, 55.1194, 53.4095, 53.8613],
    "Public Sector Expenditure": [11.8767, 13.1333, 13.4328, 14.7190, 15.5087, 17.1671, 21.0410, 21.0410, 23.1826, 25.4141, 28.1917, 28.2885, 30.6922, 33.0973, 37.2753, 37.9928, 36.7543],
    "Content Services Revenue": [13.2, 16.0, 9.0, 12.0, 6.3, 6.3, 9.8, 10.9, 13.0, 15.0, 22.0, 15.0, 19.0, 21.0, 21.0, 17.0, 0.0],
    "Hardware Revenue": [29.8, 27.4, 61.7, 61.1, 64.9, 41.5, 52.1, 61.6, 161.0, 137.0, 139.0, 139.0, 159.0, 170.0, 205.0, 193.0, 188.0],
    "IT Services Revenue": [32.0, 36.5, 25.6, 26.8, 25.4, 29.1, 44.2, 62.4, 71.0, 75.0, 67.0, 75.0, 75.0, 100.0, 102.0, 106.0, 207.0],
    "Software Revenue": [9.0, 10.2, 37.6, 32.8, 45.9, 16.8, 16.9, 21.3, 64.0, 30.0, 33.0, 33.0, 97.0, 105.0, 102.0, 74.0, 69.0],
    "Telecommunication Services Revenue": [58.5, 60.7, 16.6, 16.4, 15.8, 54.9, 58.3, 72.0, 94.0, 106.0, 97.0, 92.0, 108.0, 110.0, 99.0, 90.0, 79.0],
    "Total Mobile Subscriptions": [2877017.0, 3067033.0, 3358817.0, 3675142.0, 4090633.0, 4391733.0, 5073833.0, 6112742.0, 6576875.0, 7058117.0, 7540733.0, 7868608.0, 8235317.0, 8273658.0, 8140783.0, 8312475.0, 8427542.0]
}, index=[2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017])
# Sum each group of columns row by row (axis=1) to get one value per year
df["Expenditure"] = df[["Private Sector Expenditure", "Public Sector Expenditure"]].sum(axis=1)
df["Revenue"] = df[["Content Services Revenue", "Hardware Revenue", "IT Services Revenue", "Software Revenue", "Telecommunication Services Revenue"]].sum(axis=1)
# Scatter plot of the two totals with a fitted regression line
sns.lmplot(data=df, x="Expenditure", y="Revenue")
plt.show()

The result will be a scatter plot of Revenue against Expenditure with a fitted regression line.
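To see why the attempt in the question fails and what the row-wise sum produces, here is a minimal sketch. It assumes only pandas and, to stay short, uses just the first three years and two of the five revenue columns. The names `expenditure_cols`, `revenue_cols`, and `totals` are illustrative and not part of the original code.

```python
import pandas as pd

# First three years of the question's data (subset chosen for brevity).
df = pd.DataFrame(
    {
        "Private Sector Expenditure": [20.4502, 20.5501, 20.5362],
        "Public Sector Expenditure": [11.8767, 13.1333, 13.4328],
        "Content Services Revenue": [13.2, 16.0, 9.0],
        "Hardware Revenue": [29.8, 27.4, 61.7],
    },
    index=pd.Index([2001, 2002, 2003], name="year"),
)

# The built-in sum() starts from the integer 0, so summing a list of
# column-name strings raises TypeError before groupby is ever called.
try:
    sum(["Private Sector Expenditure", "Public Sector Expenditure"])
    builtin_sum_failed = False
except TypeError:
    builtin_sum_failed = True

# Illustrative column groups (hypothetical names).
expenditure_cols = ["Private Sector Expenditure", "Public Sector Expenditure"]
revenue_cols = ["Content Services Revenue", "Hardware Revenue"]

# Select each group of columns, then sum across columns (axis=1)
# so that every year ends up with a single total.
totals = pd.DataFrame(
    {
        "Expenditure": df[expenditure_cols].sum(axis=1),
        "Revenue": df[revenue_cols].sum(axis=1),
    }
)

print(totals)
```

The key point is that selecting the columns first and then calling `.sum(axis=1)` adds values across each row, whereas `groupby` is meant for splitting rows into groups and is not needed here.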
