Week 1 Study Summary
- "Maximum Classifier Discrepancy for Unsupervised Domain Adaptation"
Paper Notes
I. "Maximum Classifier Discrepancy for Unsupervised Domain Adaptation"
1. Method
- Use two classifiers (F1, F2) to find the regions of the target domain that are misclassified, i.e., where the two classifiers' predictions disagree.
2. Training Steps
- Step A
  - Train both classifiers and the generator to classify the source samples correctly.
  - Loss function:
    $$\mathcal{L}(X_S, Y_S) = -\mathbb{E}_{(x_s, y_s) \sim (X_S, Y_S)} \sum_{k=1}^{K} \mathbb{1}_{[k = y_s]} \log p(y \mid x_s)$$
    $$\min_{G, F_1, F_2} \mathcal{L}(X_S, Y_S)$$
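A minimal PyTorch sketch of Step A, assuming `G`, `F1`, `F2` are the generator and classifier modules and `opt_g`, `opt_f` their optimizers (these names are mine, not the paper's; see the parameter settings below for how they can be constructed):

```python
import torch.nn.functional as F

def step_a(G, F1, F2, opt_g, opt_f, x_s, y_s):
    """Train the generator and both classifier heads on a labeled source batch."""
    opt_g.zero_grad()
    opt_f.zero_grad()
    feat = G(x_s)
    # Sum of the two heads' cross-entropy losses on the source samples.
    loss = F.cross_entropy(F1(feat), y_s) + F.cross_entropy(F2(feat), y_s)
    loss.backward()
    opt_g.step()
    opt_f.step()
    return loss.item()
```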
- Step B
  - Fix the generator and train the two classifiers F1, F2 to maximize the discrepancy between their predictions on target samples, while keeping the source loss small.
  - Loss function:
    $$\min_{F_1, F_2} \mathcal{L}(X_S, Y_S) - \mathcal{L}_{adv}(X_t)$$

    $$\mathcal{L}_{adv}(X_t) = \mathbb{E}_{x_t \sim X_t}\left[ d\big(p_1(y \mid x_t),\, p_2(y \mid x_t)\big) \right]$$
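A sketch of Step B under the same assumptions. The paper takes d to be the L1 distance between the two softmax outputs; only the classifier optimizer steps here, so G stays fixed:

```python
import torch
import torch.nn.functional as F

def discrepancy(out1, out2):
    """L1 distance between the two heads' predicted class probabilities."""
    return torch.mean(torch.abs(F.softmax(out1, dim=1) - F.softmax(out2, dim=1)))

def step_b(G, F1, F2, opt_g, opt_f, x_s, y_s, x_t):
    opt_g.zero_grad()
    opt_f.zero_grad()
    feat_s = G(x_s)
    loss_s = F.cross_entropy(F1(feat_s), y_s) + F.cross_entropy(F2(feat_s), y_s)
    feat_t = G(x_t)
    loss_adv = discrepancy(F1(feat_t), F2(feat_t))
    # Keep source accuracy while pushing the two heads apart on the target.
    (loss_s - loss_adv).backward()
    opt_f.step()  # G's optimizer is not stepped, so the generator stays fixed
    return loss_adv.item()
```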
- Step C
  - Fix the classifiers and train the generator G to minimize the discrepancy; this update is repeated num_k times for every mini-batch.
  - Loss function:

    $$\min_{G} \mathcal{L}_{adv}(X_t)$$
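Step C in the same style, reusing `discrepancy` from the Step B sketch: the classifiers are frozen and only G is updated, num_k times per batch (the hyper-parameter listed under the experiment settings below). A full training iteration then chains `step_a`, `step_b`, and `step_c` on each source/target batch pair.

```python
def step_c(G, F1, F2, opt_g, opt_f, x_t, num_k=4):
    """Update only the generator so the two heads agree on target samples."""
    for _ in range(num_k):
        opt_g.zero_grad()
        opt_f.zero_grad()
        feat_t = G(x_t)
        loss_adv = discrepancy(F1(feat_t), F2(feat_t))
        loss_adv.backward()
        opt_g.step()  # only the generator moves toward agreement
    return loss_adv.item()
```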
3. Experiments on Classification
- Results figure (image omitted; see the paper's classification tables).
- Parameter settings:
  - Optimizer: Adam
  - Learning rate: 0.0002
  - Batch size: 128
  - Hyper-parameter: num_k (2-4)
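Wiring these settings up (the grouping of the three networks into two optimizers is my assumption, matching the step sketches above):

```python
import torch.optim as optim

opt_g = optim.Adam(G.parameters(), lr=2e-4)
opt_f = optim.Adam(list(F1.parameters()) + list(F2.parameters()), lr=2e-4)
```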
Paper Reproduction
- "Maximum Classifier Discrepancy for Unsupervised Domain Adaptation"
I. Source Code Analysis
1. Network Models
- svhn to mnist (model)
  - G: conv1 => bn1 => max_pool
    => conv2 => bn2 => max_pool
    => conv3 => bn3
    => fc1 => bn1_fc => dropout
    => out
  - F: fc1 => bn1_fc => fc2 => bn2_fc => fc3 => bn3_fc => out
  - Points I don't fully understand (see the sketch after this list):
    - The forward of F takes an extra `reverse` option.
    - The grad_reverse function.
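On those two points: `grad_reverse` is a gradient reversal layer, i.e. the identity on the forward pass but a sign-flipped (and scaled) gradient on the backward pass. With `reverse=True` in F's forward, a single backward pass pushes F1/F2 to maximize the discrepancy while G is pushed to minimize it, an alternative to alternating Steps B and C explicitly. A sketch of the idea and of the svhn-to-mnist pair; layer sizes and BN placement are illustrative guesses, not the repo's exact values:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambd backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # One gradient per forward input: sign-flipped for x, none for lambd.
        return grad_output.neg() * ctx.lambd, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class Feature(nn.Module):
    """G: conv/bn/pool stack, then fc1 => bn1_fc => dropout."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=5, padding=2)
        self.bn1 = nn.BatchNorm2d(64)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=5, padding=2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=5, padding=2)
        self.bn3 = nn.BatchNorm2d(128)
        self.fc1 = nn.Linear(128 * 8 * 8, 3072)  # assumes 32x32 inputs
        self.bn1_fc = nn.BatchNorm1d(3072)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.bn1(self.conv1(x))), 2)
        x = F.max_pool2d(F.relu(self.bn2(self.conv2(x))), 2)
        x = F.relu(self.bn3(self.conv3(x)))
        x = x.view(x.size(0), -1)
        x = F.dropout(F.relu(self.bn1_fc(self.fc1(x))), training=self.training)
        return x

class Predictor(nn.Module):
    """F: fc1 => bn1_fc => fc2 => bn2_fc => fc3 => out, with the reverse option."""

    def __init__(self, lambd=1.0):
        super().__init__()
        self.fc1 = nn.Linear(3072, 2048)
        self.bn1_fc = nn.BatchNorm1d(2048)
        self.fc2 = nn.Linear(2048, 512)
        self.bn2_fc = nn.BatchNorm1d(512)
        self.fc3 = nn.Linear(512, 10)
        self.lambd = lambd

    def forward(self, x, reverse=False):
        if reverse:
            # Flip the gradient flowing back into G for adversarial training.
            x = grad_reverse(x, self.lambd)
        x = F.relu(self.bn1_fc(self.fc1(x)))
        x = F.relu(self.bn2_fc(self.fc2(x)))
        return self.fc3(x)
```

The syn-to-gtsrb and usps models below follow the same pattern with different conv/fc stacks.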
- syn to gtsrb
  - G: conv1 => bn1 => max_pool
    => conv2 => bn2 => max_pool
    => conv3 => bn3 => max_pool
    => view (flatten)
    => out
  - F: fc2 => bn2_fc => fc3 => bn3_fc => out
- usps
  - G: conv1 => bn1 => max_pool
    => conv2 => bn2 => max_pool
    => view (flatten)
    => out
  - F: fc1 => bn1_fc => fc2 => bn2_fc => fc3 => bn3_fc => out
II. Experimental Results

Accuracy on the target domain (%):

| Method | SVHN to MNIST | SYNSIG to GTSRB | MNIST to USPS | USPS to MNIST |
| --- | --- | --- | --- | --- |
| Source Only | 70.1 | 92.5 | 68.6 | 58.3 |
| MCD (num_k = 4) | 95.52 ± 0.58 | 95.75 ± 0.46 | 94.06 ± 0.36 | 96.79 ± 0.31 |
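For completeness, a sketch of how the accuracies above can be computed at test time; summing the two heads' softmax outputs is one common ensembling choice (my assumption, not necessarily the repo's exact evaluation), and `loader` is an assumed target-domain test DataLoader:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def evaluate(G, F1, F2, loader, device="cpu"):
    """Return target-domain accuracy (%) for the ensembled classifiers."""
    G.eval(); F1.eval(); F2.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        feat = G(x)
        # Ensemble the two heads by summing their class probabilities.
        prob = F.softmax(F1(feat), dim=1) + F.softmax(F2(feat), dim=1)
        correct += (prob.argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return 100.0 * correct / total
```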