Python pandas hypothesis: average rating for the "expensive" books. Need some help understanding the basic features of pandas
Question
I'm studying pandas now and having trouble understanding its basic features. I'm exploring this data set. There's a variable "Price (Above Average)" that contains "Yes" if the price of the book is greater than the average, and "No" if it is less.
I assumed that a book's rating is independent of its price and want to test that. Now I need to plot the average user rating for each of the groups.
First, I want to print the average rating for the "expensive" books just to figure out how it works. I don't understand the syntax very well yet, so I'm hoping for your help.
Answer 1
Score: 0
To add each book's group average rating as a new column:
df['average_rating_for_books'] = df.groupby(['Price (Above Average)'])['User Rating (Round)'].transform('mean')
After this, you can select only the books that are expensive.
To select those rows, you can filter with a boolean mask like:
df[df['Price (Above Average)'] == 'Yes']
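As a minimal sketch of the two steps above, assuming a small made-up DataFrame that only borrows the column names from this answer (the values are invented for illustration and are not taken from the real data set):

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the answer above.
df = pd.DataFrame({
    "Price (Above Average)": ["Yes", "No", "Yes", "No"],
    "User Rating (Round)": [5, 4, 4, 3],
})

# transform("mean") returns one value per row: the mean of that row's group.
df["average_rating_for_books"] = (
    df.groupby("Price (Above Average)")["User Rating (Round)"].transform("mean")
)

# Boolean mask: keep only the rows flagged as expensive.
expensive = df[df["Price (Above Average)"] == "Yes"]

# Average rating of the expensive books as a single number.
expensive_mean = expensive["User Rating (Round)"].mean()
print(expensive_mean)
```

Note that `transform` keeps the original number of rows, so every expensive book carries the same group mean in the new column. If only the single number is needed, taking `.mean()` on the filtered rows is enough.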
Answer 2
Score: 0
Suppose your data is in a DataFrame named df:
df.groupby("Price (Above Average)").agg(avg_rating=("User Rating", "mean"))
The groupby method groups your data by each of the distinct values in the column you pass, in this case "Price (Above Average)".
The agg method aggregates a column over the groups created by groupby. Here, you are creating a new column named avg_rating from the "User Rating" column by calculating its mean within each group.
This will show you the average rating for the "expensive" books and for the others.
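A small sketch of what this returns, assuming invented sample values (only the column names come from this answer):

```python
import pandas as pd

# Hypothetical sample data; column names follow this answer.
df = pd.DataFrame({
    "Price (Above Average)": ["Yes", "No", "Yes", "No"],
    "User Rating": [5.0, 4.0, 4.0, 3.0],
})

# Named aggregation: output column avg_rating = mean of "User Rating" per group.
summary = df.groupby("Price (Above Average)").agg(avg_rating=("User Rating", "mean"))
print(summary)

# The group labels become the index, so one group's mean can be looked up by label.
expensive_mean = summary.loc["Yes", "avg_rating"]
print(expensive_mean)
```

Because the result is indexed by group label, it is already in a convenient shape for plotting. For example, `summary["avg_rating"].plot.bar()` should draw one bar per group, provided matplotlib is installed.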