Python Sklearn multi-linear Regression for probabilities - normalize coefficients to 1

Question

The question is simple. I have three equations of the form OmegaMIX = beta1 * Omega1 + beta2 * Omega2 (one per data row below), and I want to find the best coefficients by linear regression with the intercept fixed at 0. I have working code, but beta1 and beta2 are probabilities, so they must satisfy beta1 + beta2 = 1, which is not the case in the output. How can this constraint be set?
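Restated in optimization form (with the nonnegativity matching positive=True in the code below), the goal is:

  minimize    sum_i ( OmegaMIX_i - beta1 * Omega1_i - beta2 * Omega2_i )^2
  subject to  beta1 + beta2 = 1,  beta1 >= 0,  beta2 >= 0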
This is the existing code:

  import pandas as pd
  from sklearn import linear_model

  inputfilename = 'JO.csv' # input
  df = pd.read_csv(inputfilename)
  x = df.drop('OmegaMIX', axis=1) # Reference temperature
  y = df['OmegaMIX'] # Multiparametric LIRs
  regr = linear_model.LinearRegression(positive=True, fit_intercept=False)
  regr.fit(x, y)
  print('\u03B21 = ', regr.coef_[0])
  print('\u03B22 = ', regr.coef_[1])
  print("Quality = ", regr.score(x, y, sample_weight=None))

and the output is:

  β1 = 0.33995522604783796
  β2 = 0.5794270911721245
  Quality = 0.9995968335914979
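
The fitted coefficients indeed violate the constraint; a quick check, reusing the regr object from the code above:

  print('\u03B21 + \u03B22 = ', regr.coef_.sum())  # ~0.9194, not 1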

The input file is:

  OmegaMIX,Omega1,Omega2
  2.70,4.43,2.09
  1.84,3.00,1.37
  0.50,1.19,0.17

Answer 1

Score: 1

I don't think this can be done in sklearn. I would rather use CVXPY, since it lets you control the constraints.

Here's an example:

  import pandas as pd
  import cvxpy as cp

  df = pd.read_csv('JO.csv')
  # Setting our matrix X and our vector y; cvxpy needs them to be numpy arrays
  X = df.drop('OmegaMIX', axis=1).values
  y = df['OmegaMIX'].values
  # Defining the variables of the problem
  beta1 = cp.Variable()
  beta2 = cp.Variable()
  # Setting the constraints
  constraints = [beta1 >= 0, beta2 >= 0, beta1 + beta2 == 1]
  # Defining the linear regression objective (sum of squared residuals)
  objective = cp.Minimize(cp.sum_squares(X[:, 0] * beta1 + X[:, 1] * beta2 - y))
  # Solving the problem
  problem = cp.Problem(objective, constraints)
  problem.solve()
  # Printing the betas
  print('\u03B21 =', beta1.value)
  print('\u03B22 =', beta2.value)
  print('\u03B21 + \u03B22 =', beta1.value + beta2.value)
  print("Quality =", problem.value)
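
Note that problem.value here is the minimized sum of squared residuals, so it is not directly comparable to sklearn's regr.score. If an R² figure is wanted, it can be computed from the fitted betas; a minimal sketch, assuming the CVXPY code above has already run:

  import numpy as np

  # Predictions of the constrained fit
  y_hat = X[:, 0] * beta1.value + X[:, 1] * beta2.value
  # R² = 1 - SS_res / SS_tot (the same definition sklearn's score uses)
  ss_res = np.sum((y - y_hat) ** 2)
  ss_tot = np.sum((y - y.mean()) ** 2)
  print('R² =', 1 - ss_res / ss_tot)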
