参考资料:
python机器学习库scikit-learn简明教程之:随机森林
http://nbviewer.jupyter.org/github/donnemartin/data-science-ipython-notebooks/blob/master/kaggle/titanic.ipynb
scikit-learn sklearn 0.18 官方文档中文版
https://github.com/jakevdp/sklearn_pycon2015
官网:http://scikit-learn.org/stable/
Scikit-learn (sklearn) 优雅地学会机器学习 (莫烦 Python 教程)
python机器学习库scikit-learn简明教程之:AdaBoost算法
http://www.docin.com/p-1775095945.html
https://www.bilibili.com/video/av22530538/?p=6
https://github.com/Fdevmsy/Image_Classification_with_5_methods
https://github.com/huangchuchuan/SVM-HOG-images-classifier
https://blog.csdn.net/always2015/article/details/47100713
DBScan https://www.cnblogs.com/pinard/p/6208966.html
1.KNN的使用
carto@cartoPC:~$ python
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from sklearn import datasets
>>> from sklearn.cross_validation import train_test_split
/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
>>> from sklearn.neighbors import KNeighborsClassifier
>>> iris=datasets.load_iris()
>>> iris_X=iris.data
>>> iris_y=iris.target
>>> print(iris_X[:2,:])
[[ 5.1 3.5 1.4 0.2]
[ 4.9 3. 1.4 0.2]]
>>> print(iris_y)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
>>> X_train,X_test,y_train,y_test=train_test_split(iris_X,iris_y,test_size=0.3)
>>> print(y_train)
[2 1 0 0 0 2 0 0 1 1 2 2 1 1 2 2 2 0 1 0 2 2 1 1 1 1 1 0 1 1 0 2 1 0 0 2 2
0 0 2 1 0 0 2 1 2 1 2 1 1 1 2 1 2 0 2 0 1 1 2 1 0 1 2 2 0 2 2 1 0 1 1 2 2
1 0 1 1 2 0 0 1 0 1 0 2 0 1 1 0 2 1 2 0 2 0 2 0 2 1 0 2 0 2 2]
>>> knn=KNeighborsClassifier()
>>> knn.fit()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: fit() takes exactly 3 arguments (1 given)
>>> knn.fit(X_train,y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=5, p=2,
weights='uniform')
>>> print(knn.predict(X_test))
[1 1 2 0 1 1 1 1 2 0 0 2 0 1 0 0 0 1 2 2 2 2 0 1 2 0 1 2 2 0 1 2 0 0 1 0 0
0 0 1 0 1 1 2 0]
>>> print(y_test)
[1 1 2 0 1 1 1 1 2 0 0 2 0 1 0 0 0 1 2 2 2 2 0 1 2 0 1 2 2 0 2 2 0 0 2 0 0
0 0 1 0 1 1 2 0]
>>>
2.SVC的使用
import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split
def load_data():
iris=datasets.load_iris()
X_train,X_test,y_train,y_test=train_test_split(
iris.data,iris.target,test_size=0.10,random_state=0)
return X_train,X_test,y_train,y_test def test_LinearSVC(X_train,X_test,y_train,y_test):
cls=svm.LinearSVC()
cls.fit(X_train,y_train)
print('Coefficients:%s, intercept %s'%(cls.coef_,cls.intercept_))
print('Score: %.2f' %cls.score(X_test,y_test)) if __name__=="__main__":
X_train,X_test,y_train,y_test=load_data()
test_LinearSVC(X_train,X_test,y_train,y_test)
调用
carto@cartoPC:~/python_ws$ python svmtest2.py
Coefficients:[[ 0.18424504 0.45123335 -0.80794237 -0.45071267]
[-0.13381099 -0.75235247 0.57223898 -1.11494325]
[-0.7943601 -0.95801711 1.31465593 1.8169808 ]], intercept [ 0.10956304 1.86593164 -1.72576407]
Score: 1.00