无法使用Python的scipy在Jupyterlab中将数据拟合到给定函数。

huangapple go评论74阅读模式
英文:

Can't fit data to given function with scipy on python (Jupyterlab)

问题

问题在于我需要将一些数据拟合到一个函数中(由我的教授提供,所以基本上我不能更改它),但拟合效果不佳,我不明白为什么。我还尝试了lmfit方法,但也没有成功。以下是我使用的代码以及拟合结果的图像:Fit

x1 = np.array([200.  , 220.  , 240.  , 260.  , 280.  , 300.  , 320.  , 340.  ,
       360.  , 380.  , 390.  , 392.5 , 395.  , 397.5 , 400.  , 402.5 ,
       405.  , 407.5 , 410.  , 411.25, 412.5 , 413.75, 414.  , 415.25,
       416.5 , 417.75, 418.5 , 420.  , 422.5 , 425.  , 427.5 , 430.  ,
       432.5 , 435.  , 437.5 , 440.  , 450.  , 460.  , 480.  , 500.  ,
       510.  , 520.  , 540.  , 560.  , 570.  , 580.  , 600.  ])

y1 = np.array([0.00787402, 0.00904762, 0.01079365, 0.01269841, 0.01539683,
       0.01857143, 0.02321429, 0.0302    , 0.04303279, 0.06767896,
       0.09456265, 0.10338983, 0.11410579, 0.12545932, 0.14700855,
       0.16944444, 0.19692833, 0.22965779, 0.27434783, 0.28681818,
       0.32079208, 0.32562814, 0.32893401, 0.33061224, 0.32727273,
       0.31268293, 0.30236967, 0.28940092, 0.24274194, 0.20989011,
       0.18052805, 0.15531915, 0.13259053, 0.11627907, 0.10341463,
       0.07345133, 0.07455357, 0.06403509, 0.04163265, 0.03811475,
       0.02948207, 0.0272    , 0.02301587, 0.02027559, 0.01968254,
       0.01795276, 0.01672584])

σ_y1 = np.array([0.00021564, 0.00023222, 0.00025707, 0.00028599, 0.00032924,
       0.00038243, 0.00046297, 0.00058876, 0.0008021 , 0.00127062,
       0.00180749, 0.00196919, 0.00217276, 0.00239208, 0.00284333,
       0.00317296, 0.00368032, 0.00432658, 0.00507593, 0.00532676,
       0.0060111 , 0.00611128, 0.00618   , 0.00621495, 0.00607953,
       0.0057964 , 0.0056074 , 0.00536185, 0.00463023, 0.00401191,
       0.00347255, 0.00303842, 0.00250275, 0.00219007, 0.00196289,
       0.0014738 , 0.00139797, 0.00129955, 0.00077378, 0.00082702,
       0.00054713, 0.00053481, 0.00043752, 0.00041071, 0.0003831 ,
       0.00037095, 0.00033556])

# 请注意,以下是函数的代码部分,不需要翻译
# ...

# 请注意,以下是函数的代码部分,不需要翻译
# ...

您提到了一些尝试,如删除参数的边界,更改参数的名称以避免与初始值的名称冲突,以及调整初始值,但仍然遇到问题。R_tot-R_G 表示RLC电路的电阻,L 表示电感,C 表示电容。唯一未知的值是曲线的振幅 A。感谢您提前的帮助!

英文:

The problem is that I need to fit some data into a function (which was given by my professor, so basically I can't change it) but the fit goes poorly and I can't understand why. I also tried with lmfit method but that didn't work too. Here is the code I used and the image of the fit I get with residuals:Fit

x1 = np.array([200.  , 220.  , 240.  , 260.  , 280.  , 300.  , 320.  , 340.  ,
360.  , 380.  , 390.  , 392.5 , 395.  , 397.5 , 400.  , 402.5 ,
405.  , 407.5 , 410.  , 411.25, 412.5 , 413.75, 414.  , 415.25,
416.5 , 417.75, 418.5 , 420.  , 422.5 , 425.  , 427.5 , 430.  ,
432.5 , 435.  , 437.5 , 440.  , 450.  , 460.  , 480.  , 500.  ,
510.  , 520.  , 540.  , 560.  , 570.  , 580.  , 600.  ])
y1 = np.array([0.00787402, 0.00904762, 0.01079365, 0.01269841, 0.01539683,
0.01857143, 0.02321429, 0.0302    , 0.04303279, 0.06767896,
0.09456265, 0.10338983, 0.11410579, 0.12545932, 0.14700855,
0.16944444, 0.19692833, 0.22965779, 0.27434783, 0.28681818,
0.32079208, 0.32562814, 0.32893401, 0.33061224, 0.32727273,
0.31268293, 0.30236967, 0.28940092, 0.24274194, 0.20989011,
0.18052805, 0.15531915, 0.13259053, 0.11627907, 0.10341463,
0.07345133, 0.07455357, 0.06403509, 0.04163265, 0.03811475,
0.02948207, 0.0272    , 0.02301587, 0.02027559, 0.01968254,
0.01795276, 0.01672584])
σ_y1 = np.array([0.00021564, 0.00023222, 0.00025707, 0.00028599, 0.00032924,
0.00038243, 0.00046297, 0.00058876, 0.0008021 , 0.00127062,
0.00180749, 0.00196919, 0.00217276, 0.00239208, 0.00284333,
0.00317296, 0.00368032, 0.00432658, 0.00507593, 0.00532676,
0.0060111 , 0.00611128, 0.00618   , 0.00621495, 0.00607953,
0.0057964 , 0.0056074 , 0.00536185, 0.00463023, 0.00401191,
0.00347255, 0.00303842, 0.00250275, 0.00219007, 0.00196289,
0.0014738 , 0.00139797, 0.00129955, 0.00077378, 0.00082702,
0.00054713, 0.00053481, 0.00043752, 0.00041071, 0.0003831 ,
0.00037095, 0.00033556])
# Primo test di fit con 4 parametri
def fitfunc1(x, A, R, L, C, Off): # le frequenze 'f' sono in kHz
ω = 2*np.pi*x 
δ = R/(2*L)
Ω = 1/sqrt(L*C)
return A*ω/sqrt(ω**4-2*ω**2*(Ω**2-2*δ**2)+Ω**4)
R_tot = 77.54
R_G = 48
L = 308*1E-6
C = 508*1E-9
p_init1 = [31454, R_tot-R_G, L, C] # valori iniziali di (A, R, L, C)
p_best1, cov1 = curve_fit(
fitfunc1, x1, y1,          # assegno funzione di fit, ascisse e ordinate
sigma=σ_y1,                     # assegno gli errori sulle ordinate
p0=p_init1, bounds=(0, +np.inf)  # imposto i valori iniziali dei parametri e intervalli ammessi [0, +∞) 
)                                 
#Somma dei residui e deviazione standard a posteriori
res1 = (y1-fitfunc1(x1, *p_best1))
sum_res1 = sum(res1)
σ_post1 = ma.sqrt(sum_res1/(y1.size))
print("---------------------------")
print("Best fit values:")
print("---------------------------")
print("A = (%.5f +/- %.5f)   err_percentuale = %.5f" % (p_best1[0],sqrt(cov1[0,0]),sqrt(cov1[0,0])*100/p_best1[0]))
print("R = (%.5f +/- %.5f) Ω err_percentuale = %.5f" % (p_best1[1],sqrt(cov1[1,1]),sqrt(cov1[1,1])*100/p_best1[1]))
print("L = (%.5f +/- %.5f) μH err_percentuale = %.5f" % (p_best1[2]*1E+6,sqrt(cov1[2,2])*1E+6,sqrt(cov1[2,2])*100/p_best1[2]))
print("C = (%.5f +/- %.5f) F err_percentuale = %.5f" % (p_best1[3]*1E+12,sqrt(cov1[3,3])*1E+12,sqrt(cov1[3,3])*100/p_best1[3]))
print("---------------------------")
fig1 = plt.figure(1, figsize=(9,7))
frame1a=fig1.add_axes((.1,.3,.7,.5))
# Grafico del fit
plt.errorbar(x1, y1, yerr=σ_y1, fmt='.', label='dati', capsize = 4)
_pts = np.linspace(x1[0], x1[-1], 1000)
plt.plot(_pts, fitfunc1(_pts, *p_best1), label='fit')
plt.title("Amplificazione ai capi di R")
plt.xlabel("frequenza [kHz]")
plt.ylabel(r"A = $V_{out}/V_{in}$")
plt.legend()
plt.tick_params(bottom = False)
plt.grid()
frame1a.set_xticklabels([])
# Grafico dei residui
frame1b=fig1.add_axes((.1,.1,.7,.19))
plt.errorbar(x1, res1, yerr = σ_y1, fmt ='.', capsize = 4)
plt.xlabel("frequenza [kHz]")
plt.ylabel(r"$A_{mis}-A_{fit}$")
plt.grid()
#salvataggio della file nella cartella, dpi = risoluzione immagine, bbox_inches = risolve problema di dimensione
#plt.savefig("name_of_file", dpi = 200, bbox_inches = "tight")

I tried several things like removing the bounds for the parameters but that gives me enormous errors, renaming the parameters (in case they conflicted with the name of the initial value) and varying the initial values to let the fit be more precise (however initial parameters should be fixed as I wrote them because those are values that were measured with particular devices). (R_tot-R_G) represents the resistance of our RLC circuit, L is the inductance, and C is the capacity. The only unknown value is the amplitude A of the curve. Thank you in advance!

答案1

得分: 2

以下是要翻译的内容:

一个获得参数非常好近似值的简单方法:

现在,如果您想根据一些拟合标准(LMSE、LMSRE、LMAE或其他标准)获得更准确的结果,您可以使用上述参数的近似值作为起始值,开始一个非线性回归的迭代方法(使用Python或任何其他方便的软件)。

英文:

A simple method to get very good approximates of the parameters :

无法使用Python的scipy在Jupyterlab中将数据拟合到给定函数。

Now if you want more accurate results according to some criteria of fitting (LMSE, LMSRE, LMAE or other) you can use the above approximates values of parameters as initial values to start an iterative method of non-linear regression (with python or any other convenient software).

答案2

得分: 1

I think that your script is a very good start, but there are two separate issues with your fit. First, I think that you will find that for a single oscillator (and, just your mathematical function) that R, L, and C will not be able to be all determined independently -- they are almost completely correlated. Fixing one of these values will allow the other two to be determined well.

Second, the initial values you used for R, L, and C are so far from optimal that the fit simply cannot find a better solution. I think the simplest way to view that is that your initial value of L is too low by a factor of 1000.

Initial values always matter for a fit. A related topic (or perhaps a third topic) is that you should put lower bounds of 0 on all of R, L, and C -- that's physically justifiable but also avoids problems with taking the square root of a negative number.

Using lmfit (as I am a lead author -- I'm sure this could be done with curve_fit and give nearly identical results), a fit allowing all of R, L, and C, as well as A and Off (added to your function) would look like:

import numpy as np
import matplotlib.pyplot as plt
import lmfit

def fitfunc1(x, A, R, L, C, Off): # le frequenze 'f' sono in kHz
    ω = 2*np.pi*x
    δ = R/(2*L)
    Ω = 1/np.sqrt(L*C)
    return Off + A*ω/np.sqrt(ω**4-2*ω**2***2 - 2*δ**2)+Ω**4)

model = lmfit.Model(fitfunc1)

params = model.make_params(A=40, R=30, L=0.30, C=0.5e-6, Off=0)
params['A'].min = 0
params['R'].min = 0
params['L'].min = 0
params['C'].min = 0
# params['L'].vary = False

init = model.eval(params, x=x1)

result = model.fit(y1, params, x=x1, weights=1/σ_y1)

print(result.fit_report())

# #Somma dei residui e deviazione standard a posteriori
res1 = y1 - result.best_fit

fig1 = plt.figure(1, figsize=(9,7))
frame1a=fig1.add_axes((.1,.3,.7,.5))
#
# Grafico del fit
plt.errorbar(x1, y1, yerr=σ_y1, fmt='.', label='dati', capsize = 4)
plt.plot(x1, result.best_fit, 'o', label='best fit')
_pts = np.linspace(x1[0], x1[-1], 1000)
plt.plot(_pts, result.eval(x=_pts), label='interpolated fit')
plt.title("Amplificazione ai capi di R")
plt.xlabel("frequenza [kHz]")
plt.ylabel(r"A = $V_{out}/V_{in}$")
plt.legend()
plt.tick_params(bottom = False)
plt.grid(color='#EEE')
frame1a.set_xticklabels([])
#
# ....

Would give a fit report of:

[[Model]]
    Model(fitfunc1)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 65
    # data points      = 47
    # variables        = 5
    chi-square         = 400.284733
    reduced chi-square = 9.53058888
    Akaike info crit   = 110.675341
    Bayesian info crit = 119.926079
    R-squared          = -673.477539
[[Variables]]
    A:    30.9354753 +/- 0.67812562 (2.19%) (init = 40)
    R:    23.3162313 +/- 1003.45159 (4303.66%) (init = 30)
    L:    0.25082189 +/- 10.7859228 (4300.23%) (init = 0.3)
    C:    5.8633e-07 +/- 2.5214e-05 (4300.36%) (init = 5e-07)
    Off:  6.8886e-04 +/- 4.3907e-04 (63.74%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(L, C)   = -1.0000
    C(R, L)   = +1.0000
    C(R, C)   = -1.0000
    C(A, Off) = -0.8021
    C(A, R)   = +0.5709

Showing huge uncertainties for R, L, and C because their correlations all exceed 0.999.

Simply fixing L (by uncommenting # params['L'].vary = False) would give a fit with the same value for chi-square and a report of:

[[Model]]
    Model(fitfunc1)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 26
    # data points      = 47
    # variables        = 4
    chi-square         = 400.284733
    reduced chi-square = 9.30894728
    Akaike info crit   = 108.675341
    Bayesian info crit = 116.075931
    R-squared          = -673.477539
[[Variables]]
    A:    30.9354748 +/- 0.55046897 (1.78%) (init = 40)
    R:    27.8877948 +/- 0.84442502 (3.03%) (init = 30)
    L:    0.3 (fixed)
    C:    4.9022e-07 +/- 5.3035e-10 (0.11%) (init = 5e-07)
    Off:  6.8887e-04 +/- 3.9717e-04 (57.66%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(A, R)   = +0.7666
    C(A, Off) = -0.7613
    C(R, Off) = -0.5502

FWIW, the plot using the above modifications of your code would look like:

无法使用Python的scipy在Jupyterlab中将数据拟合到给定函数。

英文:

I think that your script is a very good start, but there are two separate issues with your fit. First, I think that you will find that for a single oscillator (and, just your mathematical function) that R, L, and C will not be able to be all determined independently -- they are almost completely correlated. Fixing one of these values will allow the other two to be determined well.

Second, the initial values you used for R, L, and C are so far from optimal that the fit simply cannot find a better solution. I think the simplest way to view that is that your initial value of L is too low by a factor of 1000.

Initial values always matter for a fit. A related topic (or perhaps a third topic) is that you should put lower bounds of 0 on all of R, L, and C -- that's physically justifiable but also avoids problems with taking the square root of a negative number.

Using lmfit (as I am a lead author -- I'm sure this could be done with curve_fit and give nearly identical results), a fit allowing all of R, L, and C, as well as A and Off (added to your function) would look like:

import numpy as np
import matplotlib.pyplot as plt
import lmfit
def fitfunc1(x, A, R, L, C, Off): # le frequenze &#39;f&#39; sono in kHz
ω = 2*np.pi*x
δ = R/(2*L)
Ω = 1/np.sqrt(L*C)
return Off + A*ω/np.sqrt(ω**4-2*ω**2*(Ω**2 - 2*δ**2)+Ω**4)
model = lmfit.Model(fitfunc1)
params = model.make_params(A=40, R=30, L=0.30, C=0.5e-6, Off=0)
params[&#39;A&#39;].min = 0
params[&#39;R&#39;].min = 0
params[&#39;L&#39;].min = 0
params[&#39;C&#39;].min = 0
# params[&#39;L&#39;].vary = False
init = model.eval(params, x=x1)
result = model.fit(y1, params, x=x1, weights=1/σ_y1)
print(result.fit_report())
# #Somma dei residui e deviazione standard a posteriori
res1 = y1 - result.best_fit
fig1 = plt.figure(1, figsize=(9,7))
frame1a=fig1.add_axes((.1,.3,.7,.5))
#
# Grafico del fit
plt.errorbar(x1, y1, yerr=σ_y1, fmt=&#39;.&#39;, label=&#39;dati&#39;, capsize = 4)
plt.plot(x1, result.best_fit, &#39;o&#39;, label=&#39;best fit&#39;)
_pts = np.linspace(x1[0], x1[-1], 1000)
plt.plot(_pts, result.eval(x=_pts), label=&#39;interpolated fit&#39;)
plt.title(&quot;Amplificazione ai capi di R&quot;)
plt.xlabel(&quot;frequenza [kHz]&quot;)
plt.ylabel(r&quot;A = $V_{out}/V_{in}$&quot;)
plt.legend()
plt.tick_params(bottom = False)
plt.grid(color=&#39;#EEE&#39;)
frame1a.set_xticklabels([])
#
# ....

would give a fit report of

[[Model]]
Model(fitfunc1)
[[Fit Statistics]]
# fitting method   = leastsq
# function evals   = 65
# data points      = 47
# variables        = 5
chi-square         = 400.284733
reduced chi-square = 9.53058888
Akaike info crit   = 110.675341
Bayesian info crit = 119.926079
R-squared          = -673.477539
[[Variables]]
A:    30.9354753 +/- 0.67812562 (2.19%) (init = 40)
R:    23.3162313 +/- 1003.45159 (4303.66%) (init = 30)
L:    0.25082189 +/- 10.7859228 (4300.23%) (init = 0.3)
C:    5.8633e-07 +/- 2.5214e-05 (4300.36%) (init = 5e-07)
Off:  6.8886e-04 +/- 4.3907e-04 (63.74%) (init = 0)
[[Correlations]] (unreported correlations are &lt; 0.100)
C(L, C)   = -1.0000
C(R, L)   = +1.0000
C(R, C)   = -1.0000
C(A, Off) = -0.8021
C(A, R)   = +0.5709

showing huge uncertainties for R, L, and C because their correlations all exceed 0.999.

Simply fixing L (by uncommenting # params[&#39;L&#39;].vary = False) would give a fit with the same value for chi-square and a report of

[[Model]]
Model(fitfunc1)
[[Fit Statistics]]
# fitting method   = leastsq
# function evals   = 26
# data points      = 47
# variables        = 4
chi-square         = 400.284733
reduced chi-square = 9.30894728
Akaike info crit   = 108.675341
Bayesian info crit = 116.075931
R-squared          = -673.477539
[[Variables]]
A:    30.9354748 +/- 0.55046897 (1.78%) (init = 40)
R:    27.8877948 +/- 0.84442502 (3.03%) (init = 30)
L:    0.3 (fixed)
C:    4.9022e-07 +/- 5.3035e-10 (0.11%) (init = 5e-07)
Off:  6.8887e-04 +/- 3.9717e-04 (57.66%) (init = 0)
[[Correlations]] (unreported correlations are &lt; 0.100)
C(A, R)   = +0.7666
C(A, Off) = -0.7613
C(R, Off) = -0.5502

FWIW, the plot using the above modifications of your code would look like

无法使用Python的scipy在Jupyterlab中将数据拟合到给定函数。

huangapple
  • 本文由 发表于 2023年5月10日 23:36:50
  • 转载请务必保留本文链接:https://go.coder-hub.com/76220305.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定