Getting per-class probabilities from a random forest classifier

2023-12-13 12:42:58

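The previous post showed how a decision tree produces class probabilities, so the natural next step is the random forest. I'll skip the details and go straight to the code.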

from sklearn.datasets import load_iris
from sklearn import tree
from sklearn.ensemble import RandomForestClassifier
import numpy as np
import graphviz


# Load the dataset once and reuse it
iris = load_iris()
x = iris.data
y = iris.target
feature_names = iris.feature_names

# Re-seeding before each shuffle applies the same permutation to x and y,
# so samples and labels stay aligned
np.random.seed(1)
np.random.shuffle(x)
np.random.seed(1)
np.random.shuffle(y)

# Train on only the first 10 samples to keep the trees small
x_train = x[:10, :]
y_train = y[:10]

model = RandomForestClassifier(n_estimators=3).fit(x_train, y_train)

# Predict and print the same sample
sample = x[145, :]
probability = model.predict_proba(sample.reshape(1, -1))
print(feature_names)
print(sample)
print(probability)

# Build one Graphviz source per tree in the forest
all_graph = []
for i in range(3):
    dot_data = tree.export_graphviz(model.estimators_[i],
                                    feature_names=feature_names,
                                    filled=True,
                                    rounded=True)
    graph = graphviz.Source(dot_data)
    all_graph.append(graph)
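
If Graphviz is not installed, the rules of each tree can also be read as plain text. The following is only a sketch under a few assumptions of mine: the training rows (`iris.data[::15]`), `random_state=0`, and the variable name `tree_rules` are illustrative and not part of the code above. It uses `sklearn.tree.export_text` to print the rules of every tree in the forest.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

iris = load_iris()

# Assumption: take every 15th row so the 10 training samples cover all 3 classes
x_train = iris.data[::15]
y_train = iris.target[::15]

# Assumption: random_state is fixed only to make the run repeatable
model = RandomForestClassifier(n_estimators=3, random_state=0).fit(x_train, y_train)

# One text rendering of the decision rules per tree
tree_rules = [export_text(est, feature_names=iris.feature_names)
              for est in model.estimators_]

for i, rules in enumerate(tree_rules):
    print(f"estimators_[{i}]")
    print(rules)
```

Following a sample down each printed tree gives the same leaf you would find by reading the rendered graph.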

The output can be explained as follows:

['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

[6.3 2.8 5.1 1.5]

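According to estimators_[0], the sample belongs to the third class with probability 100%.

According to estimators_[1], the sample belongs to the second class with probability 100%.

According to estimators_[2], the sample belongs to the third class with probability 100%.

So the second class gets 1 vote out of 3 trees and the third class gets 2 votes.

1/(1+2) = 0.33333333

2/(1+2) = 0.66666667

The model's output is [[0. 0.33333333 0.66666667]], which matches the calculation.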
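
The hand calculation above can also be checked in code. In scikit-learn, the forest's `predict_proba` is the mean of the per-tree probability vectors; when every tree puts 100% on a single class, that mean equals the vote fraction. The sketch below makes some assumptions of mine: `random_state=0`, the `RandomState` permutation, and the names `per_tree`, `manual`, and `forest` are illustrative, so the numbers will not necessarily match the run shown in this post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()

# Shuffle samples and labels together with one permutation
rng = np.random.RandomState(1)
idx = rng.permutation(len(iris.target))
x, y = iris.data[idx], iris.target[idx]

# Assumption: random_state is fixed only to make the run repeatable
model = RandomForestClassifier(n_estimators=3, random_state=0).fit(x[:10], y[:10])

sample = x[145].reshape(1, -1)

# Probability vector from each individual tree, stacked to (n_trees, n_classes)
per_tree = np.vstack([est.predict_proba(sample) for est in model.estimators_])

# Average over trees by hand and compare with the forest's own answer
manual = per_tree.mean(axis=0)
forest = model.predict_proba(sample)[0]

print(per_tree)
print(manual)
print(forest)
```

The columns follow `model.classes_`, so a class that is missing from the 10 training samples has no column at all.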
