Advanced Algorithms: SVM in Practice
Classifier Metrics (Revisited)
Building on the precision, recall, and F1 metrics covered earlier, we introduce $F_\beta$:
$$F_{\beta}=\frac{(1+\beta^{2})\cdot precision \cdot recall}{\beta^{2} \cdot precision+recall}$$
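- Here, the smaller $\beta^{2}$ is, the more weight is placed on precision; the larger it is, the more weight is placed on recall. Setting $\beta = 1$ recovers F1.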
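To make the effect of $\beta$ concrete, here is a minimal sketch that evaluates the formula directly. The helper name `f_beta` and the precision/recall values are hypothetical, chosen only for illustration.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Compute F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if beta < 0:
        raise ValueError("beta must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Precision and recall are both zero; define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical classifier with high precision and lower recall.
precision, recall = 0.9, 0.6

for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: F_beta={f_beta(precision, recall, beta):.4f}")
```

Because precision is the larger of the two values here, the score moves toward precision as $\beta$ shrinks and toward recall as $\beta$ grows.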
A First Look at SVM
import numpy as np
import pandas as pd
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
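# Classify the iris dataset with an SVM
# Load the iris dataset
data = pd.read_csv('./iris.csv')
print(data)
# Select the features X (columns 2 and 3) and the target Y (column 5, Species).
# Note: assuming the standard iris.csv layout with a leading index column,
# positions 2 and 3 are Sepal.Width and Petal.Length; sepal length and
# sepal width would be positions [1, 2].
X, Y = data.iloc[:, [2, 3]], data.iloc[:, 5]
# Encode the target Y as three integer classes: 0, 1, 2
Y = pd.Categorical(Y).codes
# Split the data (test_size=0.75 holds out 75% of the samples for testing)
x_train, x_test, y_train, y_test = train_test_split(X, Y, random_state=1, test_size=0.75)
# Create the SVM classifier and fit it on the training set
clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
clf.fit(x_train, y_train)
# Accuracy on the test set, computed two equivalent ways
y_hat = clf.predict(x_test)
print(y_hat)
print('Accuracy:', clf.score(x_test, y_test))
print('Accuracy:', accuracy_score(y_test, y_hat))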
Output:
Unnamed: 0 Sepal.Length ... Petal.Width Species
0 1 5.1 ... 0.2 setosa
1 2 4.9 ... 0.2 setosa
2 3 4.7 ... 0.2 setosa
3 4 4.6 ... 0.2 setosa
4 5 5.0 ... 0.2 setosa
.. ... ... ... ... ...
145 146 6.7 ... 2.3 virginica
146 147 6.3 ... 1.9 virginica
147 148 6.5 ... 2.0 virginica
148 149 6.2 ... 2.3 virginica
149 150 5.9 ... 1.8 virginica
[150 rows x 6 columns]
[0 1 1 0 2 1 1 0 0 2 1 0 2 1 1 0 1 1 0 0 1 1 1 0 2 1 0 0 1 1 1 2 1 2 1 0 1
0 1 2 2 0 1 2 1 2 0 0 0 1 0 0 2 2 2 2 1 1 2 1 0 1 1 0 0 2 0 1 1 1 1 2 1 0
1 1 2 1 2 1 0 0 0 2 0 1 2 1 0 0 1 0 2 1 2 2 1 2 2 1 0 1 0 1 1 0 1 0 0 2 1
2 0]
Accuracy: 0.8938053097345132
Accuracy: 0.8938053097345132
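Accuracy is a single number for all three classes, so it can be useful to look at $F_\beta$ per class as well. The following is a rough sketch using scikit-learn's `fbeta_score`. It assumes the bundled `load_iris` dataset as a stand-in for the local `iris.csv`, and the column indices `[1, 2]` (sepal width, petal length) are an assumed match for the CSV columns selected in the script above, so the numbers it prints may not match the output shown there.

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.metrics import fbeta_score
from sklearn.model_selection import train_test_split

# Assumption: the bundled dataset stands in for ./iris.csv.
X, y = load_iris(return_X_y=True)
X = X[:, [1, 2]]  # sepal width, petal length

# Same split and classifier settings as the script above.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, random_state=1, test_size=0.75
)
clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
clf.fit(x_train, y_train)
y_hat = clf.predict(x_test)

# average=None returns one F_beta score per class instead of a single average.
for beta in (0.5, 1.0, 2.0):
    scores = fbeta_score(y_test, y_hat, beta=beta, average=None)
    print(f'beta={beta}: per-class F_beta = {scores.round(3)}')
```

Passing `average='macro'` instead would collapse the per-class scores into their unweighted mean.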