Plotting Multiple Linear Regression Model in Python

Question

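I am trying to plot the results of a multiple linear regression model in Python, but the output is wrong: the salary values all appear as zero. Salary is the dependent variable, which depends on age, years of experience, and so on.

The salary values should be between 30000 and 50000, but the plot tells a different story. What am I missing?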

    # All required libraries
    import pandas as pd
    import warnings
    import numpy as np

    # For data visualization
    import seaborn as sns
    # %matplotlib notebook
    import matplotlib.pyplot as plt
    from mpl_toolkits.mplot3d import Axes3D
    # %matplotlib inline
    %matplotlib widget

    # For building the required model
    from sklearn import linear_model

    df = pd.read_csv('ml_data_salary.csv')

    # Plot a 3-D figure to visualize the multiple linear regression model
    # Prepare the data
    X = df[['age', 'YearsExperience']].values.reshape(-1, 2)
    Y = df['Salary']

    # Create a range for each dimension
    x = X[:, 0]
    y = X[:, 1]
    z = Y
    xx_pred = np.linspace(25, 40, 30)  # range of age values
    yy_pred = np.linspace(1, 10, 30)   # range of experience values
    xx_pred, yy_pred = np.meshgrid(xx_pred, yy_pred)
    model_viz = np.array([xx_pred.flatten(), yy_pred.flatten()]).T

    # Fit the model and predict over the grid
    ols = linear_model.LinearRegression()
    model1 = ols.fit(X, Y)
    predicted = model1.predict(model_viz)

    # Evaluate the model using its R^2 score
    r2 = model1.score(X, Y)

    # Plot the model visualization
    plt.style.use('default')
    fig = plt.figure(figsize=(12, 4))
    ax1 = fig.add_subplot(131, projection='3d')
    ax2 = fig.add_subplot(132, projection='3d')
    ax3 = fig.add_subplot(133, projection='3d')
    axes = [ax1, ax2, ax3]
    for ax in axes:
        ax.plot(x, y, z, color='k', zorder=15, linestyle='none', marker='o', alpha=0.5)
        ax.scatter(xx_pred.flatten(), yy_pred.flatten(), predicted, facecolor=(0, 0, 0, 0), s=20, edgecolor='#70b3f0')
        ax.set_xlabel('Age', fontsize=12)
        ax.set_ylabel('Experience', fontsize=12)
        ax.set_zlabel('Salary', fontsize=12)
        ax.locator_params(nbins=4, axis='x')
        ax.locator_params(nbins=5, axis='x')
    ax1.view_init(elev=27, azim=112)
    ax2.view_init(elev=16, azim=-51)
    ax3.view_init(elev=60, azim=165)
    fig.suptitle('Multi-Linear Regression Model Visualization ($R^2 = %.2f$)' % r2, fontsize=15, color='k')
    fig.tight_layout()
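
Before plotting, it can help to confirm that the fitted model actually predicts values in the expected range over the grid. The sketch below is not the code from the question: it uses synthetic data (the real `ml_data_salary.csv` is not available here) and NumPy's `np.linalg.lstsq` as a stand-in for `LinearRegression`. The helper names `fit_plane` and `predict_plane` and all numeric values are assumptions made for illustration.

```python
import numpy as np

def fit_plane(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + b2*x2; returns [b0, b1, b2]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_plane(coef, X):
    """Evaluate the fitted plane at each row of X."""
    return coef[0] + X @ coef[1:]

# Synthetic stand-in data (assumed values, not the asker's dataset).
rng = np.random.default_rng(0)
age = rng.uniform(25, 40, 50)
experience = rng.uniform(1, 10, 50)
salary = 25000 + 300 * age + 1200 * experience + rng.normal(0, 500, 50)
X = np.column_stack([age, experience])

coef = fit_plane(X, salary)

# Same grid construction as in the question: 30 x 30 points, flattened.
xx, yy = np.meshgrid(np.linspace(25, 40, 30), np.linspace(1, 10, 30))
grid = np.column_stack([xx.ravel(), yy.ravel()])
predicted = predict_plane(coef, grid)

print(grid.shape, predicted.shape)
print("predicted range:", predicted.min(), predicted.max())
```

If the printed range is far from the expected 30000 to 50000, the problem more likely lies in the data or the fit than in the plotting code.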

![enter image description here](https://i.stack.imgur.com/tKW3D.png)
# Answer 1

**Score**: 0

The data I was using was messed up. I switched to the Kaggle dataset and it worked fine. Thank you.
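
The answer does not say what was wrong with the original file. A minimal sketch of a sanity check that could reveal this kind of problem before fitting is shown below. The helper name `summarize_target`, the expected bounds, and the sample frame are illustrative assumptions, not part of the original post.

```python
import pandas as pd

def summarize_target(df, column, low, high):
    """Report basic facts about a target column before fitting a model."""
    # Coerce to numbers; anything that cannot be parsed becomes NaN.
    values = pd.to_numeric(df[column], errors="coerce")
    return {
        "dtype": str(df[column].dtype),
        "n_missing_or_non_numeric": int(values.isna().sum()),
        "n_zero": int((values == 0).sum()),
        "n_out_of_range": int(((values < low) | (values > high)).sum()),
        "min": float(values.min()),
        "max": float(values.max()),
    }

# Hypothetical frame using the column names from the question.
df = pd.DataFrame({
    "age": [25, 30, 35, 40],
    "YearsExperience": [1, 4, 7, 10],
    "Salary": [31000, 0, "n/a", 48000],
})

report = summarize_target(df, "Salary", low=30000, high=50000)
print(report)
```

Zeros, unparseable entries, or values outside the expected bounds in this report would point to the dataset rather than to the regression or the 3-D plot.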

huangapple
  • Posted on July 7, 2023, 00:36:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76630901.html