Curve fitting with lmfit for Mossbauer spectroscopy
# Question

I'm pretty new to lmfit and I keep running into this error. My dataset is fairly large, about 27,000 data points:

[my error message](https://i.stack.imgur.com/uh87r.png)
```python
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model
import pandas as pd

plt.rcParams['figure.dpi'] = 300
plt.rcParams['savefig.dpi'] = 300

# Define the Lorentzian function with a negative sign
def lorentzian(x, amp, cen, wid):
    return -amp * wid ** 2 / ((x - cen) ** 2 + wid ** 2)

# Define the sum of two Lorentzian peaks
def multi_lorentzian(x, a1, c1, w1, a2, c2, w2):
    return lorentzian(x, a1, c1, w1) + lorentzian(x, a2, c2, w2)

csv_names = ['Channel', 'Energy', 'Counts']
fp2 = "C:\\Users\\mfern\\Downloads\\Data_3_8.csv"
df = pd.read_csv(fp2, skiprows=23, names=csv_names).dropna()  # drop rows with NaN values

xdata = df['Channel']
ydata = df['Counts']

model = Model(multi_lorentzian)
params = model.make_params(a1=0, c1=0, w1=0, a2=0, c2=0, w2=0)
params['a1'].min = 0
params['a2'].min = 0
params['w1'].min = 0
params['w2'].min = 0

result = model.fit(ydata, params=params, x=xdata)

# Print the fit report
print(result.fit_report())

# Plot the data and the fit curve
plt.scatter(xdata, ydata, label='data', s=2, c='b')
plt.plot(xdata, result.best_fit, 'r-', label='fit')
plt.legend()
```
So if anyone can enlighten me, I would greatly appreciate it. [This is what my data looks like](https://i.stack.imgur.com/Kg0yq.png).

Here is my dataset (Google Drive link):

https://drive.google.com/file/d/1ho5b1iGV9_wft9h_e6xP9KRtQgGCufcD/view?usp=sharing
# Answer 1

**Score**: 1
I'm not certain, but I think the error you are seeing may come from the dataset you are using. You didn't provide data, so it's hard to know for sure.

I think the general approach to your model is OK, but it is missing some features. For one thing, your Lorentzians will have a value of zero far from their centroids, whereas your Mossbauer data sits around 25,000.

More importantly, you definitely need to give somewhat realistic initial values for the parameters, and those values should not sit on a boundary you set. In case it is not obvious: setting all values to zero gives a model curve that looks nothing like your data. I cannot stress this strongly enough: you absolutely must provide initial values close enough that a small change in them produces a noticeable change in the "data - model" result.
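Rather than hand-tuning, reasonable starting values can often be estimated from the data itself. Here is a minimal numpy sketch of that idea (the helper `guess_two_dips` is hypothetical, not part of lmfit or the original answer), assuming a flat baseline with two absorption dips, demonstrated on synthetic data:

```python
import numpy as np

def guess_two_dips(x, y):
    """Rough starting values for a flat baseline with two absorption dips
    (hypothetical helper, not part of lmfit)."""
    baseline = np.median(y)              # flat background level
    depth = y - baseline                 # negative near the dips
    i1 = np.argmin(depth)                # deepest dip
    # blank out a window around the first dip, then find the second
    window = (x.max() - x.min()) / 10
    masked = np.where(np.abs(x - x[i1]) > window, depth, 0.0)
    i2 = np.argmin(masked)
    return baseline, x[i1], depth[i1], x[i2], depth[i2]

# synthetic spectrum resembling the data in this answer
x = np.linspace(0, 1000, 2001)
y = 25000 - 4000 * np.exp(-((x - 240) / 10) ** 2) \
          - 3500 * np.exp(-((x - 790) / 12) ** 2)

base, c1, d1, c2, d2 = guess_two_dips(x, y)
# base ~ 25000, c1 ~ 240, c2 ~ 790: usable seeds for the baseline,
# the two center parameters, and (negative) amplitudes
```

Estimates like these can then seed `make_params` instead of zeros.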
Since you didn't provide data, I made some up that looks sort of close ;) and used the built-in lmfit models, something like this, which might be a good start:
```python
#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np
from lmfit.models import ConstantModel, LorentzianModel
from lmfit.lineshapes import lorentzian

npts = 2001
velo = np.linspace(0, 1000.0, npts)
signal = (26900 + np.random.normal(size=npts, scale=100)
          - 80000. * lorentzian(velo, center=240, sigma=6)
          - 79000. * lorentzian(velo, center=790, sigma=7))

# model = constant + 2 Lorentzians
model = ConstantModel() + LorentzianModel(prefix='p1_') + LorentzianModel(prefix='p2_')

# guess initial values
params = model.make_params(c=25000,
                           p1_center=250, p1_sigma=10, p1_amplitude=-10000,
                           p2_center=750, p2_sigma=10, p2_amplitude=-10000)
params['p1_amplitude'].max = 0   # enforce p1_amplitude < 0
params['p2_amplitude'].max = 0

out = model.fit(signal, params, x=velo)
print(out.fit_report())

plt.plot(velo, signal, 'o', label='data')
plt.plot(velo, out.best_fit, label='fit')
plt.legend()
plt.show()
```
This will give a report of:

```
[[Model]]
    ((Model(constant) + Model(lorentzian, prefix='p1_')) + Model(lorentzian, prefix='p2_'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 109
    # data points      = 2001
    # variables        = 7
    chi-square         = 19912468.9
    reduced chi-square = 9986.19301
    Akaike info crit   = 18434.1141
    Bayesian info crit = 18473.3239
    R-squared          = 0.96651410
[[Variables]]
    c:             26897.4478 +/- 2.43041838 (0.01%) (init = 25000)
    p1_amplitude: -80171.1149 +/- 638.944430 (0.80%) (init = -10000)
    p1_center:     239.908167 +/- 0.04587653 (0.02%) (init = 250)
    p1_sigma:      5.99624015 +/- 0.06621609 (1.10%) (init = 10)
    p2_amplitude: -78808.0930 +/- 694.381394 (0.88%) (init = -10000)
    p2_center:     789.912879 +/- 0.05883249 (0.01%) (init = 750)
    p2_sigma:      6.99732305 +/- 0.08516581 (1.22%) (init = 10)
    p1_fwhm:       11.9924803 +/- 0.13243219 (1.10%) == '2.0000000*p1_sigma'
    p1_height:    -4255.87684 +/- 32.5617804 (0.77%) == '0.3183099*p1_amplitude/max(1e-15, p1_sigma)'
    p2_fwhm:       13.9946461 +/- 0.17033163 (1.22%) == '2.0000000*p2_sigma'
    p2_height:    -3584.99901 +/- 30.1429487 (0.84%) == '0.3183099*p2_amplitude/max(1e-15, p2_sigma)'
[[Correlations]] (unreported correlations are < 0.100)
    C(p2_amplitude, p2_sigma) = -0.7230
    C(p1_amplitude, p1_sigma) = -0.7211
    C(c, p2_amplitude)        = -0.2989
    C(c, p1_amplitude)        = -0.2800
    C(c, p2_sigma)            = +0.2133
    C(c, p1_sigma)            = +0.1998
```
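The derived `p1_height` and `p2_height` lines reflect lmfit's Lorentzian parameterization, where the peak height is `amplitude / (pi * sigma)`; the factor `0.3183099` in the report is just a rounded `1/pi`. A quick check against the fitted `p1` values from the report above:

```python
import math

amplitude = -80171.1149   # fitted p1_amplitude from the report
sigma = 5.99624015        # fitted p1_sigma from the report

# lmfit Lorentzian: height = amplitude / (pi * sigma),
# i.e. the same as the reported 0.3183099*amplitude/sigma
height = amplitude / (math.pi * sigma)
```

This reproduces the reported `p1_height` of about -4255.9 (up to the rounding of `1/pi`).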
and a plot like this: