Day 4: Decision Tables

Definition 1. A decision system is a 5-tuple $S = (\mathbf{U}, \mathbf{C}, \mathbf{D}, \mathbf{V}, I)$, where

  • $\mathbf{U} = \{x_1, x_2, \dots, x_n\}$ is the set of instances.
  • $\mathbf{C} = \{a_1, a_2, \dots, a_m\}$ is the set of conditional attributes.
  • $\mathbf{D} = \{d_1, d_2, \dots, d_l\}$ is the set of decision attributes.
  • $\mathbf{V} = \bigcup_{a \in \mathbf{C} \cup \mathbf{D}} \mathbf{V}_a$ is the set of all attribute values.
  • $\mathbf{V}_a$ is the domain of attribute $a \in \mathbf{C} \cup \mathbf{D}$.
  • $I: \mathbf{U} \times (\mathbf{C} \cup \mathbf{D}) \to \mathbf{V}$ is the information function, with $I(x, a) \in \mathbf{V}_a$.
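As a minimal sketch of Definition 1, the five components can be held in plain Python containers. The attribute names (`Headache`, `Temperature`, `Flu`) and the three rows are hypothetical illustration data, not the table used in the exercise; the helper names `info` and `value_domain` are also chosen here for illustration. Each domain $\mathbf{V}_a$ is approximated by the values actually observed in $\mathbf{U}$.

```python
# Hypothetical toy decision system; names and rows are illustrative only.
U = ["x1", "x2", "x3"]                 # instances
C = ["Headache", "Temperature"]        # conditional attributes
D = ["Flu"]                            # decision attributes

# The information function I is stored as a nested dict: table[x][a] = I(x, a).
table = {
    "x1": {"Headache": "Yes", "Temperature": "High",   "Flu": "Yes"},
    "x2": {"Headache": "No",  "Temperature": "Normal", "Flu": "No"},
    "x3": {"Headache": "Yes", "Temperature": "Normal", "Flu": "No"},
}

def info(x, a):
    """Information function I: U x (C ∪ D) -> V."""
    return table[x][a]

def value_domain(a):
    """V_a, approximated by the values attribute a takes over U."""
    return {info(x, a) for x in U}

# V is the union of V_a over all attributes a in C ∪ D.
V = set().union(*(value_domain(a) for a in C + D))
print(sorted(V))  # ['High', 'No', 'Normal', 'Yes']
```

Note that $\mathbf{C}$ and $\mathbf{D}$ hold attribute names here, while the attribute values all end up in $\mathbf{V}$.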
  1. Write out $\mathbf{U}, \mathbf{C}, \mathbf{D}, \mathbf{V}$ for the table below. Note: the last two attributes are decision attributes.
    [Table image: Day 4 decision table]

$\mathbf{U} = \{x_1, x_2, \dots, x_7\}$.
$\mathbf{C} = \{\textrm{Yes}, \textrm{No}, \textrm{High}, \textrm{Normal}, \textrm{Low}\}$.
$\mathbf{D} = \{\textrm{Normal}, \textrm{Abnormal}, \textrm{Yes}, \textrm{No}\}$.
$\mathbf{V} = \{\textrm{Yes}, \textrm{No}, \textrm{High}, \textrm{Normal}, \textrm{Low}, \textrm{Abnormal}\}$.

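  2. Define a label distribution system, i.e., one in which each label value is not 0/1 but a real number in the interval $[0, 1]$, and the label values of each object sum to 1.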
    Definition: A label distribution system is a tuple $S = (\mathbf{X}, \mathbf{Y})$, where $\mathbf{X} = [x_{ij}]_{n \times m} \in \mathbb{R}^{n \times m}$ is the data matrix, $\mathbf{Y} = [y_{ik}]_{n \times l} \in [0, 1]^{n \times l}$ is the label matrix satisfying $\sum_{k=1}^{l} y_{ik} = 1$ for every $i \in \{1, \dots, n\}$, $n$ is the number of instances, $m$ is the number of features, and $l$ is the number of labels.
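
The two constraints on the label matrix can be checked mechanically. The sketch below assumes the label matrix is given as a list of rows and verifies that every entry lies in $[0, 1]$ and that every row sums to 1 within a floating-point tolerance. The function name `is_label_distribution`, the tolerance, and the sample matrices are hypothetical choices made for illustration.

```python
import math

def is_label_distribution(Y, tol=1e-9):
    """Return True if every row of Y has entries in [0, 1] that sum to 1."""
    for row in Y:
        # Each label value y_ik must lie in the closed interval [0, 1].
        if any(y < 0.0 or y > 1.0 for y in row):
            return False
        # The label values of one instance must sum to 1 (up to rounding).
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True

Y_ok = [[0.2, 0.5, 0.3], [1.0, 0.0, 0.0]]   # valid: each row sums to 1
Y_bad = [[0.6, 0.6, 0.0]]                   # invalid: row sums to 1.2

print(is_label_distribution(Y_ok))   # True
print(is_label_distribution(Y_bad))  # False
```

The second row of `Y_ok` shows that an ordinary 0/1 single-label assignment is a special case of a label distribution.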