Error messages when fitting a non-linear exponential model between two variables

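Question

I have two variables, and I am trying to model the relationship between them and extract the residuals. The relationship is clearly non-linear and exponential. I have tried a few different approaches with nls, but each one fails with a different error message.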

# dataset
df <- structure(list(y = c(464208.56, 334962.43, 361295.68, 426535.68, 258843.93, 272855.46, 
   166322.72, 244695.28, 227003.03, 190728.4, 156025.45, 72594.24, 56911.4, 175328.95, 161199.76, 
   152520.77, 190610.57, 60734.34, 31620.9, 74518.86, 45524.49, 2950.58, 2986.38, 15961.77, 12484.05, 
   6828.41, 2511.72, 1656.12, 5271.4, 7550.66, 3357.71, 3620.43, 3699.85, 3337.56, 4106.55, 3526.66, 
   2996.79, 1649.89, 4561.64, 1724.25, 3877.2, 4426.69, 8557.61, 6021.61, 6074.17, 4072.77, 4032.95, 
   5280.16, 7127.22), 
   x = c(39.23, 38.89, 38.63, 38.44, 38.32, 38.27, 38.3, 38.4, 38.56, 38.79, 39.06, 39.36, 39.68, 
   40.01, 40.34, 40.68, 41.05, 41.46, 41.93, 42.48, 43.14, 43.92, 44.84, 45.9, 47.1, 48.4, 49.78, 
   51.2, 52.62, 54.01, 55.31, 56.52, 57.6, 58.54, 59.33, 59.98, 60.46, 60.78, 60.94, 60.92, 60.71, 
   60.3, 59.69, 58.87, 57.86, 56.67, 55.33, 53.87, 52.33)), 
   row.names = c(NA, -49L), 
   class = c("tbl_df", "tbl", "data.frame"), 
   na.action = structure(c(`1` = 1L, `51` = 51L), 
   class = "omit"))

# initial model
m <- nls(y ~  a * exp(r * x), 
         start = list(a = 0.5, r = -0.2), 
         data = df)
Error in nls(y ~ a * exp(r * x), start = list(a = 0.5, r = -0.2), data = df,  : singular gradient

# add the alg (algorithm) argument
m <- nls(y ~  a * exp(r * x), 
         start = list(a = 0.5, r = -0.2), 
         data = df,
         alg = "plinear")
Error in nls(y ~ a * exp(r * x), start = list(a = 0.5, r = -0.2), data = df,  : 
  step factor 0.000488281 reduced below 'minFactor' of 0.000976562

Answer 1

Score: 3

log-Gaussian GLM

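As @Gregor Thomas suggests, you could linearize your problem (fit a log-linear regression), at the cost of changing the error model. (Basic model diagnostics, i.e. a scale-location plot, suggest that this would be a much better statistical model!) However, you can fit the original model efficiently, without changing the error structure, by fitting a log-link Gaussian GLM: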

m1 <- glm(y ~ x, family = gaussian(link = "log"), data = df)

The model is y ~ Normal(exp(b0 + b1*x), s), so a = exp(b0) and r = b1.
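To make the mapping concrete, here is a minimal sketch on simulated data. The data frame sim, the true values 3e5 and -0.35, and the noise level are all assumptions chosen for illustration; they are not the question's data. It fits both the log-linear regression and the log-link Gaussian GLM, then converts each fit to the a and r of y = a*exp(r*x).

```r
# Simulated data; every name and value here is hypothetical.
set.seed(1)
sim <- data.frame(x = seq(38, 50, length.out = 40))
# Multiplicative noise keeps y positive, which glm() needs in order to
# find default starting values under a log link.
sim$y <- 3e5 * exp(-0.35 * (sim$x - 38)) * exp(rnorm(40, sd = 0.05))

# Log-linear regression: Gaussian errors on the log scale
lin <- lm(log(y) ~ x, data = sim)

# Log-link Gaussian GLM: Gaussian errors on the original scale of y
g <- glm(y ~ x, family = gaussian(link = "log"), data = sim)

# Both fits have the linear predictor b0 + b1 * x on the log scale,
# so a = exp(b0) and r = b1 in each case.
est <- rbind(
  lm  = c(a = exp(coef(lin)[[1]]), r = coef(lin)[[2]]),
  glm = c(a = exp(coef(g)[[1]]),   r = coef(g)[[2]])
)
print(est)
```

The two rows will generally differ a little, because the two models weight the observations differently.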

I tried using list(a=exp(coef(m1)[1]), r=coef(m1)[2]) as starting values, but even this was too finicky for nls().

There are two ways to get nls to work.

shifted exponential

As @GregorThomas suggests, shifting the x-origin to x=38 also works fine (given a sensible starting value):

m <- nls(y ~  a * exp(r * (x-38)), 
         start = list(a = 3e5, r = -0.35), 
         data = df)
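One likely reason the shift helps: y = a*exp(r*(x - 38)) is the same curve as y = [a*exp(-38*r)]*exp(r*x), so the unshifted a equals the shifted a multiplied by exp(-38*r). The sketch below plugs in the starting values used above (the variable names are made up for illustration) and shows that the unshifted a sits many orders of magnitude away from the question's start of a = 0.5.

```r
a_shifted <- 3e5   # starting value for a in the shifted model
r <- -0.35         # starting value for r
x0 <- 38           # the shift applied to x

# Equivalent intercept for the unshifted model y = a * exp(r * x)
a_unshifted <- a_shifted * exp(-r * x0)
print(a_unshifted)

# The two parameterizations trace the same curve
x <- c(38, 45, 60)
stopifnot(isTRUE(all.equal(a_unshifted * exp(r * x),
                           a_shifted * exp(r * (x - x0)))))
```

In the shifted form, a is simply the expected value of y at x = 38, which is on the same scale as the observed data and so is much easier to guess.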

provide nls with a gradient

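The deriv function will generate a function with the right structure for nls (it returns the model function's values, with a "gradient" attribute giving the matrix of derivatives with respect to the parameters) if you ask it nicely, i.e. if you supply function.arg. (I'm also using the exponentiated intercept from the log-Gaussian GLM as a starting value ...)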

f <- deriv( ~ a*exp(r*x), c("a", "r"), function.arg = c("x", "a", "r"))
m2 <- nls(y ~  f(x, a, r),
         start = list(a = exp(coef(m1)[1]), r = -0.35),
         data = df)
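To see what nls receives, the function built by deriv can be called directly. This is a small sketch with made-up inputs (x = 0 and 1, a = 2, r = 0.5); the derivative matrix is read back with attr(out, "gradient") and has one column per parameter.

```r
f <- deriv(~ a * exp(r * x), c("a", "r"), function.arg = c("x", "a", "r"))

# Evaluate at two hypothetical x values
out <- f(x = c(0, 1), a = 2, r = 0.5)

# Model values a * exp(r * x), with the attribute dropped
vals <- as.numeric(out)

# Derivatives: column "a" is exp(r * x), column "r" is a * x * exp(r * x)
grad <- attr(out, "gradient")
print(vals)
print(grad)
```

Because the derivatives are analytic, nls does not have to approximate them numerically, which may be why this fit is less fragile here.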

We can plot these to compare the predictions (visually identical):

par(las = 1, bty = "l")
xvec <- seq(38, 60, length = 101)
plot(y ~ x, df)
lines(xvec, predict(m1, newdata = data.frame(x=xvec), type = "response"),
      col = 2)
lines(xvec, predict(m, newdata = data.frame(x=xvec)), col = 4,  lty = 2)
lines(xvec, predict(m2, newdata = data.frame(x=xvec)), col = 5,  lty = 2)

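With a little bit of extra work (exponentiating the intercept for the Gaussian GLM, shifting the x-origin back to zero for the nls fit) we can compare the coefficients (they are only equal up to a tolerance of 2e-4, but that should be good enough, right?)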

a1 <- exp(coef(m1)[[1]])
a2 <- coef(m)[[1]]*exp(-38*coef(m)[[2]])
all.equal(c(a = a1, r = coef(m)[[2]]),
          c(a = a2, r = coef(m1)[[2]]), tolerance = 1e-4)
all.equal(c(a = a1, r = coef(m)[[2]]),
          coef(m2), tolerance = 2e-4)
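Since the question's goal was to extract residuals, here is a minimal sketch of that last step. It uses simulated data rather than the question's df (sim and all its values are assumptions for illustration) and the shifted-exponential form of the model.

```r
# Simulated data; every name and value here is hypothetical.
set.seed(1)
sim <- data.frame(x = seq(38, 50, length.out = 40))
sim$y <- 3e5 * exp(-0.35 * (sim$x - 38)) * exp(rnorm(40, sd = 0.05))

# Shifted exponential fit, as in the answer
fit <- nls(y ~ a * exp(r * (x - 38)),
           start = list(a = 3e5, r = -0.35),
           data = sim)

# Raw residuals: observed y minus fitted y, on the original scale
res <- residuals(fit)
print(head(as.numeric(res)))
```

For a glm fit, residuals(fit, type = "response") gives the same kind of raw residual; the default for glm objects is deviance residuals.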

huangapple
  • Posted on June 2, 2023 at 00:48:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76384082.html