Using groupby on a DataFrame in Pandas

Question

import pandas as pd
df = pd.read_csv(r"C:\Users\Ouis AL-Hetar\Documents\TestEmployeeTable1.csv")
sal= df.groupby("Department").sum("Salary").reset_index()
sal.columns=["Dapartment","Sum_of_salary"]

print(sal)

When I tried to run this code, it raised an error (the screenshots of the error are not included here).

I tried printing head() to check whether there were any mistakes in the column names, but I did not notice any.

I hope someone who knows what the problem is can help me.
Sorry for my poor English.

Answer 1

Score: 1

The default separator that pd.read_csv expects in a CSV file is a comma (,). In your case, the separator appears to be a semicolon rather than a comma, so you need to pass sep=";" to pd.read_csv to read your file correctly:

#                                        HERE --v
df = pd.read_csv("TestEmployeeTable1.csv", sep=";")
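
A quick way to check whether the separator is the culprit is to look at the column names after loading. The sketch below uses a small in-memory sample in place of the real file; the sample rows and the Name column are assumptions made for illustration, not the contents of TestEmployeeTable1.csv.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real semicolon-separated file.
raw = "Name;Department;Salary\nAnn;HR;1000\nBob;IT;2000\nCid;IT;1500\n"

# With the default sep=",", each line is parsed as one single field,
# so the DataFrame ends up with one column and no "Department" column.
wrong = pd.read_csv(io.StringIO(raw))
print(list(wrong.columns))

# With sep=";", the header is split into the three expected columns.
right = pd.read_csv(io.StringIO(raw), sep=";")
print(list(right.columns))
```

If the first print shows a single name that still contains semicolons, grouping by "Department" cannot work, because no column with that name exists.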

However, you also have to modify the rest of your code so that only the Salary column is summed:

sal = df.groupby("Department", as_index=False)["Salary"].sum()
sal.columns = ["Department", "Sum_of_salary"]

# OR

sal = (df.groupby("Department", as_index=False)
         .agg(Sum_of_salary=("Salary", "sum")))
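
To see that the two variants give the same table, here is a minimal, self-contained sketch. The sample data is made up for illustration and is read from a string rather than from the real file. One likely reason the original .sum("Salary") call needs changing is that the first parameter of the GroupBy sum method is numeric_only, not a column name, so it does not select the Salary column.

```python
import io
import pandas as pd

# Hypothetical sample data; the real file's contents are not known.
raw = "Name;Department;Salary\nAnn;HR;1000\nBob;IT;2000\nCid;IT;1500\n"
df = pd.read_csv(io.StringIO(raw), sep=";")

# Variant 1: select the Salary column, sum it per group, then rename.
sal_a = df.groupby("Department", as_index=False)["Salary"].sum()
sal_a.columns = ["Department", "Sum_of_salary"]

# Variant 2: named aggregation sets the output column name directly.
sal_b = (df.groupby("Department", as_index=False)
           .agg(Sum_of_salary=("Salary", "sum")))

print(sal_a)
print(sal_b)
```

The second variant avoids the separate renaming step, which also removes the risk of a mismatch between the number of names assigned and the number of columns in the result.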

Answer 2

Score: 0

pandas.DataFrame.groupby() is a little different from most DataFrame methods: it does not directly return a DataFrame or a Series. Instead it returns a GroupBy object, which splits the DataFrame into groups only in an abstract sense. Nothing is actually computed until a function is called on the GroupBy object.

Also remember that groupby follows the split-apply-combine pattern: split the DataFrame, apply a function to each group, and combine the results.
Rather than calling head() on the DataFrame directly, I think you could use DataFrameGroupBy.head(n), which returns the first n rows of each group.
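
As a rough illustration of these points, the sketch below builds a small DataFrame (the values are invented for the example), shows that groupby returns a GroupBy object rather than a DataFrame, and then applies an aggregation and head(1) to it.

```python
import pandas as pd

# Hypothetical data used only to illustrate the behaviour.
df = pd.DataFrame({
    "Department": ["HR", "IT", "IT", "HR", "IT"],
    "Salary": [1000, 2000, 1500, 1200, 1800],
})

# Split: this only describes the grouping; no sums are computed yet.
grouped = df.groupby("Department")
print(type(grouped).__name__)

# Apply and combine: the aggregation runs per group and the results
# are combined into a Series indexed by Department.
totals = grouped["Salary"].sum()
print(totals)

# head(1) on the GroupBy object keeps the first row of each group,
# with the original row labels preserved.
first_rows = grouped.head(1)
print(first_rows)
```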

huangapple
  • Posted on June 29, 2023, 11:07:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76577835.html