- In a neural network, features extracted by the lower layers are general enough to be reused across different vision tasks, while features extracted by the higher layers carry strong semantic information.
- When computing the network's overall loss from the per-layer loss functions, a different weight needs to be set for each layer's loss value; the total loss is then the weighted combination of the per-layer losses.
- When computing the loss, one can use not only a distance loss on the feature-map values but also loss formulations based on other features, such as gradients or orientations.
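The weighted combination of per-layer losses described above can be sketched as follows (a minimal NumPy illustration; `layer_loss` here is a plain mean-squared distance, and both function names are illustrative, not from the paper):

```python
import numpy as np

def layer_loss(f_t, f_s):
    """Mean squared distance between teacher and student feature maps (C, H, W)."""
    return np.mean((f_t - f_s) ** 2)

def total_loss(teacher_feats, student_feats, weights):
    """Weighted sum of per-layer losses, one weight per feature level."""
    return sum(w * layer_loss(t, s)
               for w, t, s in zip(weights, teacher_feats, student_feats))
```

For example, with per-layer losses of 1.0 and 4.0 and weights 1.0 and 0.5, the total loss is 1.0 + 2.0 = 3.0.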
Abstract
Anomaly detection is a challenging task and usually formulated as an unsupervised learning problem for the unexpectedness of anomalies. This paper proposes a simple yet powerful approach to this issue, which is implemented in the student-teacher framework for its advantages but substantially extends it in terms of both accuracy and efficiency. Given a strong model pre-trained on image classification as the teacher, we distill the knowledge into a single student network with the identical architecture to learn the distribution of anomaly-free images, and this one-step transfer preserves the crucial clues as much as possible. Moreover, we integrate the multi-scale feature matching strategy into the framework, and this hierarchical feature alignment enables the student network to receive a mixture of multi-level knowledge from the feature pyramid under better supervision, thus allowing it to detect anomalies of various sizes. The difference between feature pyramids generated by the two networks serves as a scoring function indicating the probability of anomaly occurring. Due to such operations, our approach achieves accurate and fast pixel-level anomaly detection. Very competitive results are delivered on three major benchmarks, significantly superior to the state-of-the-art ones. In addition, it makes inferences at a very high speed (with a speed of 100 FPS for images of size 256×256), at least dozens of times faster than the latest counterparts.
The network architecture is as follows:
The final score map is obtained by upsampling the score maps of different sizes to the size of the original input image; the score of each pixel is the product of the corresponding pixel values of the different score maps. The maximum value in the score map is taken as the anomaly score of the test image.
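A minimal sketch of this fusion step (nearest-neighbour upsampling is used here for brevity and assumes the input size is a multiple of each map size; the function names are illustrative):

```python
import numpy as np

def upsample_nearest(score_map, out_h, out_w):
    """Nearest-neighbour upsampling of a 2-D score map to (out_h, out_w).
    Assumes out_h and out_w are integer multiples of the map's size."""
    h, w = score_map.shape
    up = np.repeat(score_map, out_h // h, axis=0)
    return np.repeat(up, out_w // w, axis=1)

def fuse_score_maps(score_maps, out_size):
    """Upsample every per-level score map to the input size, multiply them
    element-wise, and take the maximum pixel as the image-level score."""
    fused = np.ones(out_size)
    for m in score_maps:
        fused *= upsample_nearest(m, *out_size)
    return fused, fused.max()
```

For instance, fusing a 2×2 map of constant 2.0 with a 4×4 map of constant 3.0 at output size 4×4 yields a fused map of constant 6.0, so the image-level score is 6.0.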
- The parameters of the teacher network take the corresponding values from a ResNet-18 pretrained on ImageNet, while those of the student network are initialized randomly.
- We choose the first three blocks (i.e., conv2_x, conv3_x, conv4_x) of ResNet-18 as the pyramid feature extractors.
- All images in the MVTec-AD and STC datasets are resized to 256×256.
Loss Function
- $F_t^k(i,j)$: feature vector at position $(i,j)$ in the $k$-th feature map from the teacher
- $F_s^k(i,j)$: feature vector at position $(i,j)$ in the $k$-th feature map from the student
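From these definitions, the loss first ℓ2-normalizes each pair of feature vectors and measures their squared distance; the per-layer losses are averaged over positions and combined with per-layer weights $\alpha_k$, as noted at the top of these notes. A reconstruction consistent with these definitions:

```latex
\hat{F}_t^k(i,j) = \frac{F_t^k(i,j)}{\lVert F_t^k(i,j)\rVert_{2}},
\qquad
\hat{F}_s^k(i,j) = \frac{F_s^k(i,j)}{\lVert F_s^k(i,j)\rVert_{2}}

\ell^k(i,j) = \frac{1}{2}\,\bigl\lVert \hat{F}_t^k(i,j) - \hat{F}_s^k(i,j)\bigr\rVert_{2}^{2}

L = \sum_{k} \alpha_k \,\frac{1}{H_k W_k} \sum_{i=1}^{H_k}\sum_{j=1}^{W_k} \ell^k(i,j)
```

Here $H_k$ and $W_k$ denote the height and width of the $k$-th feature map.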
Experiment Settings
- We choose the first three blocks (i.e., conv2_x, conv3_x, conv4_x) of ResNet-18 as the pyramid feature extractors. The parameters of the teacher network take the corresponding values from the ResNet-18 pretrained on ImageNet, while those of the student network are initialized randomly. This is in good agreement with the previous discovery that the middle-level features play a more important role in knowledge transfer.
- Data augmentation is not used, since some standard augmentation techniques may lead to ambiguous determination of anomalies.
Conclusion
We present a new feature pyramid matching technique and incorporate it into the student-teacher anomaly detection framework. Given a powerful network pretrained on image classification as the teacher, we use its different levels of features to guide a student network with the same structure to learn the distribution of anomaly-free images. The anomaly scoring function of a test image can be defined as the difference between the feature pyramids generated by the two models. This one-step knowledge transfer largely simplifies the detection procedure.