How to find the best Gaussian fit for the data
Question
I have a data set for which I am plotting a graph of time vs intensity at a particular frequency.
On the x axis is the time data set which is in a numpy array and on the y axis is the intensity array.
import numpy as np

time = np.array([
     0.3,  1.3,  2.3,  3.3,  4.3,  5.3,  6.3,  7.3,  8.3,  9.3, 10.3, 11.3,
    12.3, 13.3, 14.3, 15.3, 16.3, 17.3, 18.3, 19.3, 20.3, 21.3, 22.3, 23.3,
    24.3, 25.3, 26.3, 27.3, 28.3, 29.3, 30.3, 31.3, 32.3, 33.3, 34.3, 35.3,
    36.3, 37.3, 38.3, 39.3, 40.3, 41.3, 42.3, 43.3, 44.3, 45.3, 46.3, 47.3,
    48.3, 49.3, 50.3, 51.3, 52.3, 53.3, 54.3, 55.3, 56.3, 57.3, 58.3, 59.3,
])
intensity = np.array([
    1.03587, 1.03187, 1.03561, 1.02893, 1.04659, 1.03633, 1.0481,
    1.04156, 1.02164, 1.02741, 1.02675, 1.03651, 1.03713, 1.0252,
    1.02853, 1.0378, 1.04374, 1.01427, 1.0387, 1.03389, 1.03148,
    1.04334, 1.042, 1.04154, 1.0161, 1.0469, 1.03152, 1.22406,
    5.4362, 7.92132, 6.50259, 4.7227, 3.32571, 2.46484, 1.74615,
    1.51446, 1.2711, 1.15098, 1.09623, 1.0697, 1.06085, 1.05837,
    1.04151, 1.0358, 1.03574, 1.05095, 1.03382, 1.04629, 1.03636,
    1.03219, 1.03555, 1.02886, 1.04652, 1.02617, 1.04363, 1.03591,
    1.04199, 1.03726, 1.03246, 1.0408,
])
When I plot this with matplotlib using:
import matplotlib.pyplot as plt

plt.figure(figsize=(15, 6))
plt.title('Single frequency graph at 636 kHz', fontsize=18)
plt.plot(time, intensity)
plt.xticks(time[::3], fontsize=12)
plt.yticks(fontsize=12)
plt.xlabel('Elapsed time (minutes:seconds)', fontsize=18)
plt.ylabel('Intensity at 1020 kHz', fontsize=18)
plt.savefig('WIND_Single_frequency_graph_1020_kHz')
plt.show()
the graph looks like this:
I want to fit a Gaussian curve to this data, and this is the code I used:
from scipy.optimize import curve_fit

def Gauss(x, A, B):
    y = A * np.exp(-1 * B * x**2)
    return y

parameters, covariance = curve_fit(Gauss, time, intensity)
fit_A = parameters[0]
fit_B = parameters[1]
fit_y = Gauss(time, fit_A, fit_B)
plt.figure(figsize=(15, 6))
plt.plot(time, intensity, 'o', label='data')
plt.plot(time, fit_y, '-', label='fit')
plt.legend()
And the best fit I obtain looks like this:
Where am I going wrong? How can I make the best-fit curve fit the data better?
Answer 1
Score: 1
By inspection, one observes that the curve isn't symmetrical. This is even more visible on a logarithmic y-scale, zoomed in on the range close to the peak.
This suggests that the Gaussian model (which is symmetrical) cannot be fitted correctly.
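The asymmetry can also be checked numerically rather than by eye. The sketch below is only one possible check, not part of the original answer: it takes the median of the data as the baseline (an assumption) and compares the half-widths at half maximum on the two sides of the peak. The helper name `crossing` is made up for illustration.

```python
import numpy as np

# Data from the question: 60 samples at 0.3, 1.3, ..., 59.3
time = np.arange(60) + 0.3
intensity = np.array([
    1.03587, 1.03187, 1.03561, 1.02893, 1.04659, 1.03633, 1.0481,
    1.04156, 1.02164, 1.02741, 1.02675, 1.03651, 1.03713, 1.0252,
    1.02853, 1.0378, 1.04374, 1.01427, 1.0387, 1.03389, 1.03148,
    1.04334, 1.042, 1.04154, 1.0161, 1.0469, 1.03152, 1.22406,
    5.4362, 7.92132, 6.50259, 4.7227, 3.32571, 2.46484, 1.74615,
    1.51446, 1.2711, 1.15098, 1.09623, 1.0697, 1.06085, 1.05837,
    1.04151, 1.0358, 1.03574, 1.05095, 1.03382, 1.04629, 1.03636,
    1.03219, 1.03555, 1.02886, 1.04652, 1.02617, 1.04363, 1.03591,
    1.04199, 1.03726, 1.03246, 1.0408,
])

baseline = np.median(intensity)        # assumed estimate of the flat background
peak = int(np.argmax(intensity))       # index of the highest sample
half = baseline + 0.5 * (intensity[peak] - baseline)  # half-maximum level


def crossing(step):
    """Walk away from the peak (step = -1 or +1) and linearly interpolate
    the time at which the curve drops to the half-maximum level."""
    i = peak
    while intensity[i + step] > half:
        i += step
    t0, t1 = time[i], time[i + step]
    y0, y1 = intensity[i], intensity[i + step]
    return t0 + (half - y0) * (t1 - t0) / (y1 - y0)


left_hw = time[peak] - crossing(-1)    # half-width on the rising side
right_hw = crossing(+1) - time[peak]   # half-width on the decaying side
print(f"left half-width:  {left_hw:.2f}")
print(f"right half-width: {right_hw:.2f}")
```

For a symmetric peak the two half-widths would be about equal; here the decaying side is clearly wider than the rising side, which is consistent with the observation that a symmetric Gaussian is a poor match.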
One also observes that, on the logarithmic scale, the part of the curve around the peak is not far from linear. Thus a better model might be made of a combination of two exponential functions, for example:
I suppose that you can code this function in your nonlinear regression software. The rough parameter values above can be used as starting values for the iterative calculation.
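The exact formula from this answer is not preserved in the text, so the following is only a sketch of one way to combine two exponentials into an asymmetric peak, not necessarily the form the answer had in mind. The function name `two_exp_peak`, its parameterisation, the starting values, and the bounds are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# Data from the question: 60 samples at 0.3, 1.3, ..., 59.3
time = np.arange(60) + 0.3
intensity = np.array([
    1.03587, 1.03187, 1.03561, 1.02893, 1.04659, 1.03633, 1.0481,
    1.04156, 1.02164, 1.02741, 1.02675, 1.03651, 1.03713, 1.0252,
    1.02853, 1.0378, 1.04374, 1.01427, 1.0387, 1.03389, 1.03148,
    1.04334, 1.042, 1.04154, 1.0161, 1.0469, 1.03152, 1.22406,
    5.4362, 7.92132, 6.50259, 4.7227, 3.32571, 2.46484, 1.74615,
    1.51446, 1.2711, 1.15098, 1.09623, 1.0697, 1.06085, 1.05837,
    1.04151, 1.0358, 1.03574, 1.05095, 1.03382, 1.04629, 1.03636,
    1.03219, 1.03555, 1.02886, 1.04652, 1.02617, 1.04363, 1.03591,
    1.04199, 1.03726, 1.03246, 1.0408,
])


def two_exp_peak(x, c, A, x0, tau_rise, tau_decay):
    """Baseline c plus an asymmetric peak (assumed form).

    Far to the left of x0 the peak grows like exp((x - x0) / tau_rise);
    far to the right it decays like exp(-(x - x0) / tau_decay).
    """
    dx = x - x0
    return c + A / (np.exp(-dx / tau_rise) + np.exp(dx / tau_decay))


def sse(params):
    """Sum of squared residuals for a parameter vector."""
    return float(np.sum((intensity - two_exp_peak(time, *params)) ** 2))


# Rough starting values read off the data (the time constants are guesses).
baseline = np.median(intensity)
p0 = [baseline, 2 * (intensity.max() - baseline),
      time[np.argmax(intensity)], 0.5, 2.0]

# Bounds keep the time constants positive and the exponentials finite.
lower = [0.0, 0.0, 20.0, 0.1, 0.1]
upper = [5.0, 100.0, 40.0, 20.0, 20.0]

popt, pcov = curve_fit(two_exp_peak, time, intensity, p0=p0,
                       bounds=(lower, upper), max_nfev=20000)

print("c, A, x0, tau_rise, tau_decay =", popt)
print("SSE at start:", sse(p0), " SSE after fit:", sse(popt))
```

Using separate rise and decay constants is what lets the model follow a sharp onset and a slower tail. Plotting `two_exp_peak(time, *popt)` over the data, ideally with a logarithmic y-axis, is a quick way to judge whether this assumed form is adequate.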
Answer 2
Score: 0
You need to define a more flexible model (with more parameters) and provide reasonable initial values for them:
def f(x, a, b, mu, sigma):
    # a: baseline offset, b: peak height, mu: peak position, sigma: width
    return a + b * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

popt, pcov = curve_fit(f, time, intensity, p0=[1, 1, 30.3, 2])
plt.plot(time, intensity)
plt.plot(time, f(time, *popt))
plt.show()
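The initial values do not have to be typed in by hand. As a sketch, the same four-parameter model can be started from estimates read off the data, and the covariance matrix returned by `curve_fit` gives rough one-sigma uncertainties. The name `gauss_offset`, the median-based baseline, and the width guess of 2 are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

# Data from the question: 60 samples at 0.3, 1.3, ..., 59.3
time = np.arange(60) + 0.3
intensity = np.array([
    1.03587, 1.03187, 1.03561, 1.02893, 1.04659, 1.03633, 1.0481,
    1.04156, 1.02164, 1.02741, 1.02675, 1.03651, 1.03713, 1.0252,
    1.02853, 1.0378, 1.04374, 1.01427, 1.0387, 1.03389, 1.03148,
    1.04334, 1.042, 1.04154, 1.0161, 1.0469, 1.03152, 1.22406,
    5.4362, 7.92132, 6.50259, 4.7227, 3.32571, 2.46484, 1.74615,
    1.51446, 1.2711, 1.15098, 1.09623, 1.0697, 1.06085, 1.05837,
    1.04151, 1.0358, 1.03574, 1.05095, 1.03382, 1.04629, 1.03636,
    1.03219, 1.03555, 1.02886, 1.04652, 1.02617, 1.04363, 1.03591,
    1.04199, 1.03726, 1.03246, 1.0408,
])


def gauss_offset(x, a, b, mu, sigma):
    """Gaussian of height b centred at mu with width sigma, on a baseline a."""
    return a + b * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))


def sse(params):
    """Sum of squared residuals for a parameter vector."""
    return float(np.sum((intensity - gauss_offset(time, *params)) ** 2))


# Data-driven starting values: baseline from the median, height from the
# maximum above the baseline, position from the highest sample.
baseline = np.median(intensity)
p0 = [baseline,
      intensity.max() - baseline,
      time[np.argmax(intensity)],
      2.0]  # width is a guess, in the same units as `time`

popt, pcov = curve_fit(gauss_offset, time, intensity, p0=p0)
perr = np.sqrt(np.diag(pcov))  # approximate one-sigma parameter uncertainties

for name, value, err in zip(["a", "b", "mu", "sigma"], popt, perr):
    print(f"{name} = {value:.4f} +/- {err:.4f}")
print("SSE at start:", sse(p0), " SSE after fit:", sse(popt))
```

Starting the optimiser near the peak matters because the default starting point used by `curve_fit` (all parameters equal to 1) places the Gaussian far from the data, where the residuals barely depend on `mu` and `sigma`.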