January 4, 2020, 01:14:10

Turning bars to a normal distribution

Question

I'm new to Python.

I have two arrays and a nice bar chart:

import matplotlib.pyplot as plt

# Buyers in %
h = [1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0]
# Clothes size
x = [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]
# P(X=40) = 16 %  // the probability that a buyer gets a size-40 garment is 16 %
# P(37 <= X <= 40) = 5+9+13+16 = 43 %  // the probability that a buyer gets a size between 37 and 40 is 43 %
plt.ylabel('Buyers %')
plt.xlabel('Clothes Size')
plt.bar(x, height=h)
plt.grid(True)
plt.show()

How can I turn this into a density curve and a normal distribution using seaborn or scipy.stats.norm, and draw it over the bars?
After that, how can I calculate P(X<40) using the normal distribution?

Thank you.

Answer 1

Score: 1

Using seaborn:

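import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm

# Buyers in %
h = [1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0]
# Clothes size
x = [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]

# Expand the percentages into a sample: each size x[i] is repeated h[i] times
data = []
for size, count in zip(x, h):
    data += [size] * count

sns.set()
plt.figure(figsize=(10, 5), dpi=300)
# Histogram of the sample with a fitted normal curve drawn on top
# (note: distplot is deprecated in seaborn 0.11 and later)
sns.distplot(data, fit=norm, kde=False)
plt.show()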
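Since the question asks for the curve to be drawn over the original bars, here is a minimal sketch that skips the expanded sample and overlays a normal density on `plt.bar` directly. It assumes the percentages in `h` can be used as weights for the mean and standard deviation, and that each bar is one size unit wide, so the density is scaled by the total of `h`. The helper name `plot_bars_with_normal` is made up for this illustration.

```python
import numpy as np
from scipy.stats import norm

h = np.array([1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0])  # buyers in %
x = np.array([34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48])  # clothes size

# Weighted mean and (population) standard deviation, using the percentages as weights
mu = np.average(x, weights=h)
sigma = np.sqrt(np.average((x - mu) ** 2, weights=h))


def plot_bars_with_normal():
    """Draw the original bars and overlay the fitted normal curve (hypothetical helper)."""
    import matplotlib.pyplot as plt

    grid = np.linspace(x.min() - 1, x.max() + 1, 200)
    # The bars are 1 unit wide and their heights sum to h.sum(),
    # so multiply the density by that total to put both on the same scale
    curve = norm.pdf(grid, loc=mu, scale=sigma) * h.sum()

    plt.bar(x, height=h, label='Observed')
    plt.plot(grid, curve, color='red', label='Normal fit')
    plt.ylabel('Buyers %')
    plt.xlabel('Clothes Size')
    plt.grid(True)
    plt.legend()
    plt.show()
```

Calling `plot_bars_with_normal()` opens the figure; `mu` and `sigma` are available at module level for later probability calculations.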

To get the probability:

from scipy.stats import norm
import numpy as np

# Fit a normal distribution to the sample mean and standard deviation
sample_mean = np.array(data).mean()
sample_std = np.array(data).std()
min_value = int(sample_mean - 4 * sample_std)
max_value = int(sample_mean + 4 * sample_std)
dist = norm(sample_mean, sample_std)
# Evaluate the density at every integer size within about 4 standard deviations of the mean
values = [value for value in range(min_value, max_value)]
probabilities = [dist.pdf(value) for value in values]
# plt.plot(values, probabilities)

def prob(min_lim, max_lim):
    # 1 for sizes strictly between the limits, 0 elsewhere
    p = (np.array(values) > min_lim).astype(int) * (np.array(values) < max_lim).astype(int)
    # Sum the density only over the selected sizes
    return (p * np.array(probabilities)).sum()

prob(0, 40)
Out[2]: 0.3230891372830226

NOTE: this differs from the value computed directly from the data (0.32) because it uses a continuous normal distribution estimated from the mean and standard deviation of your data.
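As an alternative to summing density values at integer sizes, the fitted distribution's cumulative distribution function gives the probability in one call. The sketch below is not the answer's method. It assumes the same weighted mean and standard deviation as the data, and shows both the plain continuous probability and a continuity-corrected version that treats "size below 40" as "up to 39.5". The variable names are illustrative.

```python
import numpy as np
from scipy.stats import norm

h = np.array([1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0])  # buyers in %
x = np.array([34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48])  # clothes size

# Parameters of the fitted normal, weighted by the percentages
mu = np.average(x, weights=h)
sigma = np.sqrt(np.average((x - mu) ** 2, weights=h))
dist = norm(loc=mu, scale=sigma)

# P(X < 40), treating size as a continuous quantity
p_continuous = dist.cdf(40)

# Continuity correction: integer sizes 39 and below correspond to X < 39.5
p_corrected = dist.cdf(39.5)

# P(37 <= X <= 40) with the same correction on both ends
p_37_to_40 = dist.cdf(40.5) - dist.cdf(36.5)

print(p_continuous, p_corrected, p_37_to_40)
```

The corrected values are the ones to compare against the bar-chart percentages, since the sizes themselves are discrete.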

If you don't want the continuous estimate, the empirical proportion is simply:

len(np.array(data)[np.array(data) < 40]) / len(data)
Out[2]: 0.32
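The same empirical probabilities can also be read straight from the two original arrays, without building the expanded sample. This is a small sketch assuming `h` holds the weights for the sizes in `x`. The helper `empirical_prob` is a hypothetical name used only here.

```python
import numpy as np

h = np.array([1, 1, 3, 5, 9, 13, 16, 16, 14, 10, 5, 4, 2, 1, 0])  # buyers in %
x = np.array([34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48])  # clothes size


def empirical_prob(mask):
    """Share of buyers whose size is selected by the boolean mask."""
    return h[mask].sum() / h.sum()


p_lt_40 = empirical_prob(x < 40)                    # sizes 34..39
p_37_to_40 = empirical_prob((x >= 37) & (x <= 40))  # sizes 37..40

print(p_lt_40, p_37_to_40)
```

The second value reproduces the 5+9+13+16 = 43 % worked out in the question.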
