Machine Learning Foundations: Notes 3

2024-04-01 12:56:22

Types of Machine Learning

Classification by output space

1. Binary classification, output Y = {−1, +1}:

PLA is a linear binary classifier; other binary classification models exist as well.

2. Multiclass classification, output Y = {1, 2, · · · , K} (abstractly). Binary classification is the special case K = 2.

Example: the coin recognition problem.

3. Regression, output Y = R or Y = [lower, upper] ⊂ R (bounded regression)

4. Structured learning: the sequence tagging problem

sentence ⇒ structure (class of each word). Output Y = {PVN, PVP, NVN, PV, · · · }, which does not include invalid structures such as VVVVV.

In summary, learning problems can be categorized by the type of their output space.
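The four output spaces above can be made concrete with a small membership check. This is only an illustrative sketch: the function name `in_output_space`, its parameters, and the partial set of valid structures are assumptions introduced here, not part of the course material.

```python
import math

def in_output_space(y, kind, K=None, lower=None, upper=None, valid=None):
    """Return True if y is a legal output for the given problem type.

    All names here are hypothetical and chosen for illustration only.
    """
    if kind == "binary":
        # Y = {-1, +1}
        return y in (-1, +1)
    if kind == "multiclass":
        # Y = {1, 2, ..., K}
        return isinstance(y, int) and 1 <= y <= K
    if kind == "regression":
        # Y = R (any finite real number)
        return isinstance(y, (int, float)) and math.isfinite(y)
    if kind == "bounded_regression":
        # Y = [lower, upper], a subset of R
        return isinstance(y, (int, float)) and lower <= y <= upper
    if kind == "structured":
        # Y = a set of valid structures, e.g. tag sequences
        return y in valid
    raise ValueError(f"unknown problem type: {kind}")

# Partial set of tag sequences taken from the note; the full set is larger.
some_valid_structures = {"PVN", "PVP", "NVN", "PV"}
```

Binary classification fits the multiclass case with K = 2 once the labels {−1, +1} are relabeled as {1, 2}.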

Classification by data labels

1. Supervised learning: every xn comes with a corresponding yn

2. Unsupervised learning: every xn comes without a corresponding yn

3. Semi-supervised learning: some xn come with a corresponding yn (used when labels are hard, slow, or costly to obtain)

4. Reinforcement learning: no explicit yn; the learner instead receives feedback on how good its output was

In summary, learning problems can be categorized by how the data are labeled.
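As a rough illustration of the first three settings, the sketch below inspects how many examples carry a label. The function name `label_setting` and the convention that `None` marks a missing yn are assumptions made for this example; reinforcement learning is not covered because its feedback is not a per-example label.

```python
def label_setting(labels):
    """Name the learning setting implied by a list of labels.

    labels: one entry per example x_n; None means y_n is missing.
    (Hypothetical helper, for illustration only.)
    """
    if not labels:
        raise ValueError("need at least one example")
    n_labeled = sum(y is not None for y in labels)
    if n_labeled == len(labels):
        return "supervised"       # every x_n has a y_n
    if n_labeled == 0:
        return "unsupervised"     # no x_n has a y_n
    return "semi-supervised"      # only some x_n have a y_n
```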

Classification by learning protocol (how the data are presented to the learner)

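1. Batch learning: learn from all the known data at once

2. Online learning: process one example at a time, correct the model according to the result, then move on to the next example

3. Active learning: learning by ‘asking’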
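The difference between the batch and online protocols can be sketched with a perceptron-style update. This is a minimal sketch under stated assumptions: the note does not tie these protocols to a specific model, so the perceptron update rule, the function names, and the toy data are all choices made here for illustration. Each x includes a leading 1 as a bias coordinate.

```python
def predict(w, x):
    """Linear binary classifier: sign of the inner product w . x."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score > 0 else -1

def corrected(w, x, y):
    """Perceptron-style correction: move w toward y * x."""
    return [wi + y * xi for wi, xi in zip(w, x)]

def batch_train(data, dim, max_passes=100):
    """Batch protocol: the whole data set is available up front,
    so we may sweep over it repeatedly until no mistakes remain."""
    w = [0.0] * dim
    for _ in range(max_passes):
        mistakes = 0
        for x, y in data:
            if predict(w, x) != y:
                w = corrected(w, x, y)
                mistakes += 1
        if mistakes == 0:
            break
    return w

def online_train(stream, dim):
    """Online protocol: examples arrive one by one; each is used
    to correct the model once and is then discarded."""
    w = [0.0] * dim
    for x, y in stream:
        if predict(w, x) != y:
            w = corrected(w, x, y)
    return w

# Toy, linearly separable data (assumed for illustration).
data = [
    ([1, 2.0, 2.0], +1),
    ([1, 3.0, 1.0], +1),
    ([1, -1.0, -1.0], -1),
    ([1, -2.0, 0.0], -1),
]
w_batch = batch_train(data, dim=3)
w_online = online_train(iter(data), dim=3)
```

The batch learner can revisit old examples, while the online learner only ever sees the current one, so it suits data that arrive sequentially.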

Classification by input feature type

1. Concrete features

These features usually come from prior domain knowledge, for example (size, mass) for coin classification, and have a ‘sophisticated physical meaning’.

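2. Raw features

These features usually have a ‘simple physical meaning’, for example the pixels of an image. Because such data are simple and abstract, they are harder for a machine to learn from.

Raw features often need humans (this is commonly called feature engineering) or machines to convert them into concrete ones.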

3. Abstract features

‘No physical meaning’, and thus even more difficult for ML; these again need ‘feature conversion/extraction/construction’.
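To illustrate what converting raw features into concrete ones might look like, the sketch below turns a grid of pixel values into two hand-designed features. The choice of features (average intensity and left-right symmetry) and all function names are assumptions made for this example, not something the note prescribes.

```python
def average_intensity(image):
    """Mean pixel value over the whole image (a list of rows)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def horizontal_symmetry(image):
    """Negative mean absolute difference between the image and its
    left-right mirror; 0 means perfectly symmetric, lower is less so."""
    diffs = [abs(p - q) for row in image for p, q in zip(row, reversed(row))]
    return -sum(diffs) / len(diffs)

def to_concrete_features(image):
    """Map raw pixels to a short, human-designed feature vector."""
    return (average_intensity(image), horizontal_symmetry(image))

# Tiny hypothetical images: a centered vertical bar and a bar on the left edge.
centered_bar = [[0, 1, 0],
                [0, 1, 0],
                [0, 1, 0]]
left_bar = [[1, 0, 0],
            [1, 0, 0]]
```

Here nine (or six) raw pixel values are reduced to two numbers whose meaning a human can state directly, which is the kind of conversion that feature engineering performs by hand.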

Summary

References:

Machine Learning Foundations (complete edition), Hsuan-Tien Lin (林轩田), *大学 https://www.bilibili.com/video/BV1W7411z7Ra?p=13