[云炬: Playing with Machine Learning in Python 3] 5-5 Metrics for Evaluating Regression Algorithms: MSE vs MAE


05 Metrics for evaluating regression algorithms: MSE vs MAE

In [3]:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
import datetime;print("Run by CYJ,",datetime.datetime.now())
Run by CYJ, 2022-01-20 12:53:42.123449

Boston housing data

In [4]:

boston = datasets.load_boston()  # note: load_boston was removed in scikit-learn 1.2, so this needs an older version

In [5]:

boston.keys()

Out[5]:

dict_keys(['data', 'target', 'feature_names', 'DESCR', 'filename'])

In [6]:

# print(boston.DESCR)

In [7]:

boston.feature_names

Out[7]:

array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD',
       'TAX', 'PTRATIO', 'B', 'LSTAT'], dtype='<U7')

In [13]:

x = boston.data[:,5] # use only the number-of-rooms feature (RM)

In [14]:

x.shape

Out[14]:

(506,)

In [15]:

y = boston.target

In [16]:

y.shape

Out[16]:

(506,)

In [17]:

plt.scatter(x, y)
plt.show()

In [18]:

np.max(y)

Out[18]:

50.0

In [19]:

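# drop the samples sitting at the maximum value 50.0 (prices in this dataset are capped there)
x = x[y < 50.0]  # filter x first, while y still has its original length
y = y[y < 50.0]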

In [20]:

x.shape

Out[20]:

(490,)

In [21]:

y.shape

Out[21]:

(490,)

In [22]:

plt.scatter(x, y)
plt.show()

Using simple linear regression

In [23]:

from playML.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, seed=666)
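The source of `playML.model_selection.train_test_split` is not shown here. As a rough idea of what such a helper might do, here is a minimal sketch that shuffles the sample indices and slices off a test set. The `test_ratio=0.2` default and the use of `np.random.default_rng` are assumptions for illustration (0.2 is at least consistent with the 392/98 split reported below); the shuffling in playML may differ, so the same seed will not necessarily reproduce the same split.

```python
import numpy as np

def train_test_split(X, y, test_ratio=0.2, seed=None):
    """Hypothetical sketch: shuffle indices, then split X and y by test_ratio."""
    assert X.shape[0] == y.shape[0], "X and y must have the same number of samples"
    assert 0.0 <= test_ratio <= 1.0, "test_ratio must be in [0, 1]"

    rng = np.random.default_rng(seed)         # seeded generator for reproducibility
    shuffled = rng.permutation(len(X))        # random ordering of 0..len(X)-1
    test_size = int(len(X) * test_ratio)      # number of samples held out for testing

    test_idx = shuffled[:test_size]
    train_idx = shuffled[test_size:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Made-up data with the same number of samples as the filtered Boston set
x_demo = np.arange(490, dtype=float)
y_demo = 2.0 * x_demo + 1.0
x_tr, x_te, y_tr, y_te = train_test_split(x_demo, y_demo, seed=666)
print(x_tr.shape, x_te.shape)
```

Indexing `X` and `y` with the same index arrays keeps each feature value paired with its own target after the shuffle.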

In [24]:

x_train.shape

Out[24]:

(392,)

In [25]:

y_train.shape

Out[25]:

(392,)

In [26]:

x_test.shape

Out[26]:

(98,)

In [27]:

y_test.shape

Out[27]:

(98,)

In [28]:

from playML.SimpleLinearRegression import SimpleLinearRegression

In [29]:

reg = SimpleLinearRegression()
reg.fit(x_train, y_train)

Out[29]:

SimpleLinearRegression()

In [30]:

reg.a_

Out[30]:

7.8608543562689555

In [31]:

reg.b_

Out[31]:

-27.459342806705543
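Here `a_` and `b_` are the slope and intercept of the fitted line `y = a_ * x + b_`. The internals of `playML.SimpleLinearRegression` are not shown, so the following is only a sketch that assumes the usual closed-form least-squares solution for one feature. The function name `fit_simple_linear_regression` is made up for this example.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Hypothetical sketch: closed-form least squares for y = a * x + b."""
    x_mean, y_mean = np.mean(x), np.mean(y)
    # slope: sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
    a = np.dot(x - x_mean, y - y_mean) / np.dot(x - x_mean, x - x_mean)
    # intercept: the fitted line passes through the point (x_mean, y_mean)
    b = y_mean - a * x_mean
    return a, b

# Made-up, perfectly linear data: the fit should recover slope 2 and intercept 1
x_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = 2.0 * x_demo + 1.0
a, b = fit_simple_linear_regression(x_demo, y_demo)
print(a, b)
```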

In [32]:

plt.scatter(x_train, y_train)
plt.plot(x_train, reg.predict(x_train), color='r')
plt.show()

In [35]:

plt.scatter(x_train, y_train)
plt.scatter(x_test, y_test, color="c")
plt.plot(x_train, reg.predict(x_train), color='r')
plt.show()

In [36]:

y_predict = reg.predict(x_test)

MSE

In [37]:

mse_test = np.sum((y_predict - y_test)**2) / len(y_test)
mse_test

Out[37]:

24.156602134387438

RMSE

In [38]:

from math import sqrt

rmse_test = sqrt(mse_test)
rmse_test

Out[38]:

4.914936635846635

MAE

In [39]:

mae_test = np.sum(np.absolute(y_predict - y_test))/len(y_test)
mae_test

Out[39]:

3.5430974409463873
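For reference, the three cells above compute the following quantities, written here with $m$ for the number of test samples, $y^{(i)}$ for the true value and $\hat{y}^{(i)}$ for the prediction (the notation is chosen for this note, not taken from the notebook):

```latex
\mathrm{MSE}  = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
\qquad
\mathrm{MAE}  = \frac{1}{m} \sum_{i=1}^{m} \left| \hat{y}^{(i)} - y^{(i)} \right|
```

RMSE and MAE are in the same units as `y`, while MSE is in squared units, which is why RMSE (about 4.91) is easier to compare with MAE (about 3.54) than MSE (about 24.16).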

Wrapping our own evaluation functions

For the code, see here

In [40]:

from playML.metrics import mean_squared_error
from playML.metrics import root_mean_squared_error
from playML.metrics import mean_absolute_error
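The `playML.metrics` module itself is not reproduced here. A minimal sketch of what three such functions could look like, following the NumPy expressions used in the MSE, RMSE and MAE cells above, is shown below. The short names `mse`, `rmse` and `mae` are chosen for this example so they are not mistaken for the playML functions.

```python
import numpy as np

def mse(y_true, y_predict):
    """Mean squared error between y_true and y_predict."""
    assert len(y_true) == len(y_predict), "y_true and y_predict must have the same length"
    return np.sum((y_true - y_predict) ** 2) / len(y_true)

def rmse(y_true, y_predict):
    """Root mean squared error: square root of the MSE."""
    return np.sqrt(mse(y_true, y_predict))

def mae(y_true, y_predict):
    """Mean absolute error between y_true and y_predict."""
    assert len(y_true) == len(y_predict), "y_true and y_predict must have the same length"
    return np.sum(np.absolute(y_true - y_predict)) / len(y_true)

# Made-up values: errors are 1, 0, -2, 0
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.0, 5.0, 9.0, 9.0])
print(mse(y_true_demo, y_pred_demo))   # 1.25
print(rmse(y_true_demo, y_pred_demo))
print(mae(y_true_demo, y_pred_demo))   # 0.75
```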

In [41]:

mean_squared_error(y_test, y_predict)

Out[41]:

24.156602134387438

In [42]:

root_mean_squared_error(y_test, y_predict)

Out[42]:

4.914936635846635

In [43]:

mean_absolute_error(y_test, y_predict)

Out[43]:

3.5430974409463873

MSE and MAE in scikit-learn

In [44]:

from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error

In [45]:

mean_squared_error(y_test, y_predict)

Out[45]:

24.156602134387438

In [46]:

mean_absolute_error(y_test, y_predict)

Out[46]:

3.5430974409463873

MSE vs. MAE
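One well-known difference between the two families of metrics is that squaring makes MSE and RMSE more sensitive to a few large errors than MAE. The sketch below illustrates this on made-up numbers (not the Boston data): every prediction is off by 1, and then a single prediction is pushed to be off by 10.

```python
import numpy as np

def rmse(y_true, y_predict):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_predict) ** 2))

def mae(y_true, y_predict):
    """Mean absolute error."""
    return np.mean(np.absolute(y_true - y_predict))

y_true = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

# Case 1: every prediction is off by exactly 1
y_pred_even = y_true + np.array([1.0, -1.0, 1.0, -1.0, 1.0])

# Case 2: same as case 1, but the last prediction is off by 10
y_pred_outlier = y_pred_even.copy()
y_pred_outlier[-1] = y_true[-1] + 10.0

print("even errors:  MAE =", mae(y_true, y_pred_even), " RMSE =", rmse(y_true, y_pred_even))
print("one outlier:  MAE =", mae(y_true, y_pred_outlier), " RMSE =", rmse(y_true, y_pred_outlier))
```

With equal errors the two metrics coincide. With one large error, MAE rises from 1 to 2.8 while RMSE rises from 1 to about 4.56, so RMSE is pulled much further by the single bad prediction. The same ordering appears in the Boston results above, where RMSE is larger than MAE.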
