ImageNet(http://www.image-net.org)是李菲菲组的图像库,和WordNet 可以结合使用 (毕业于Caltech;导师:Pietro Perona;主页:http://vision.stanford.edu/~feifeili/)
总共有十万的synset, 其中2010的数据表示,有图像的非空synset是21841,每一类大约1000张图片,图片总数:14197122。
Caffe中训练ImageNet使用的是Alex Krizhevsky的方法(AlexNet),这个有60million参数的模型(简称为AlexNet),在ILSVRC 2012中赢得了第一名,Top5错误率15.3%。
2013年,Clarifai通过cnn模型可视化技术调整网络架构,赢得了ILSVRC (http://www.clarifai.com/ 看八卦 可以看zhihu.com Filestorm:的一段闲聊: http://zhuanlan.zhihu.com/cvprnet/19821292)。
2014年,google也加入进来,它通过增加模型的层数(总共22层),并且利用multi-scale data training,取得第一名,Top5错误率 6.66%。Google的模型大大增加的网络的深度,并且去掉了最顶层的全连接层:因为全连接层(Fully Connected)几乎占据了CNN大概90%的参数,但是同时又可能带来过拟合(overfitting)的效果。它的模型比以前AlexNet的模型大大缩小,并且减轻了过拟合带来的副作用。Alex模型参数是60M,GoogLeNet只有7M。
2015年,百度用Deep Image,Top5错误率达到5.98% ,它基于GPU,利用36个服务节点开发了一个专为深度学习运算的supercompter(名叫Minwa,敏娲)。这台supercomputer具备TB级的host memory,超强的数据交换能力,训练了一个有100billion参数的巨大的深层神经网络。 (这一系统包含 36 个服务器节点,每一服务器节点配备了 2 颗六核英特尔至强 E5-2620 处理器。每个服务器包含 4 颗英伟达 Tesla K40m GPU,以及 1 个 FDR
InfiniBand(速度为 56GB/S)。这带来了高性能、低延时的连接,以及对 RDMA 的支持。每一颗 GPU 的最高浮点运算性能为每秒 4.29 万亿次浮点运算,而每一颗 GPU 也配备了 12GB 的内存。整体来看,Minwa 内置了 6.9TB 的主内存、1.7TB 的设备内存,而理论上的最高性能约为 0.6 千万亿次浮点运算。凭借如此强大的系统,研究人员可以使用与其他深度学习项目不同,或者说更好的训练数据。因此,百度没有使用常见的 256x256 像素图片,而是使用了 512x512 像素图片,并且可以给这些图片添加各种特效,例如色彩调整、增加光晕,以及透镜扭曲等。这样做的目的是使系统学习更多尺寸更小的对象,并在各种环境下识别
对象。)
AlexNet整个模型比起之前我们看到的Cifar10 和LeNet模型相对来说复杂一些, 训练时间是在两台GTX 580 3GB GPUs上进行了5-6天的训练,其中用到了Hinton的改进方法(在全连接层加入ReLU+Dropout), 对于这个模型,我们首先认定一个完整的卷积层可能包括一层convolution,一层Rectified Linear Units,一层max-pooling,一层normalization。则整个网络结构包括五层卷积层和三层全连接层,网络的最前端是输入图片的原始像素点,最后端是图片的分类结果。
对于每一层网络,具体的网络参数配置如下图所示。InputLayer就是输入图片层,每个输入图片都将被缩放成227*227大小,分rgb三个颜色维度输入。Layer1~ Layer5是卷积层,以Layer1为例,卷积滤波器的大小是11*11,卷积步幅为4,本层共有96个卷积滤波器,本层的输出则是96个55*55大小的图片。在Layer1,卷积滤波后,还接有ReLUs操作和max-pooling操作。Layer6~ Layer8是全连接层,相当于在五层卷积层的基础上再加上一个三层的全连接神经网络分类器。以Layer6为例,本层的神经元个数为4096个。Layer8的神经元个数为1000个,相当于训练目标的1000个图片类别。
下面的图来自sunbaigui的 博文 http://blog.csdn.net/sunbaigui/article/details/39938097
流程是一致的,不过对于每个值的计算过程 有点不一样。
conv1阶段DFD(data flow diagram):
2. conv2阶段DFD(data flow diagram):
3. conv3阶段DFD(data flow diagram):
4. conv4阶段DFD(data flow diagram):
5. conv5阶段DFD(data flow diagram):
6. fc6阶段DFD(data flow diagram):
7. fc7阶段DFD(data flow diagram):
8. fc8阶段DFD(data flow diagram):
然后我调用训练好的模型
MODEL_FILE = '../models/bvlc_reference_caffenet/deploy.prototxt'
PRETRAINED = '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
获取到预测结果,然后再标签里查找到对应的分类
imagenet_label = caffe_root + 'data/ilsvrc12/synset_words.txt'
输入一张猫的图片
输出前五个分类
['n02123045 tabby, tabby cat' 'n02127052 lynx, catamount'
'n02124075 Egyptian cat' 'n07930864 cup' 'n02123159 tiger cat']
[281 287 285 968 282]
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
I0318 02:36:31.173322 15756 net.cpp:358] Input 0 -> data
I0318 02:36:31.173369 15756 net.cpp:56] Memory required for data: 0
I0318 02:36:31.173436 15756 net.cpp:67] Creating Layer conv1
I0318 02:36:31.173446 15756 net.cpp:394] conv1 <- data
I0318 02:36:31.173463 15756 net.cpp:356] conv1 -> conv1
I0318 02:36:31.173481 15756 net.cpp:96] Setting up conv1
I0318 02:36:31.173616 15756 net.cpp:103] Top shape: 10 96 55 55 (2904000)
I0318 02:36:31.173624 15756 net.cpp:113] Memory required for data: 11616000
I0318 02:36:31.173655 15756 net.cpp:67] Creating Layer relu1
I0318 02:36:31.173665 15756 net.cpp:394] relu1 <- conv1
I0318 02:36:31.173679 15756 net.cpp:345] relu1 -> conv1 (in-place)
I0318 02:36:31.173691 15756 net.cpp:96] Setting up relu1
I0318 02:36:31.173697 15756 net.cpp:103] Top shape: 10 96 55 55 (2904000)
I0318 02:36:31.173702 15756 net.cpp:113] Memory required for data: 23232000
I0318 02:36:31.173715 15756 net.cpp:67] Creating Layer pool1
I0318 02:36:31.173722 15756 net.cpp:394] pool1 <- conv1
I0318 02:36:31.173737 15756 net.cpp:356] pool1 -> pool1
I0318 02:36:31.173750 15756 net.cpp:96] Setting up pool1
I0318 02:36:31.173763 15756 net.cpp:103] Top shape: 10 96 27 27 (699840)
I0318 02:36:31.173768 15756 net.cpp:113] Memory required for data: 26031360
I0318 02:36:31.173780 15756 net.cpp:67] Creating Layer norm1
I0318 02:36:31.173787 15756 net.cpp:394] norm1 <- pool1
I0318 02:36:31.173800 15756 net.cpp:356] norm1 -> norm1
I0318 02:36:31.173811 15756 net.cpp:96] Setting up norm1
I0318 02:36:31.173820 15756 net.cpp:103] Top shape: 10 96 27 27 (699840)
I0318 02:36:31.173825 15756 net.cpp:113] Memory required for data: 28830720
I0318 02:36:31.173837 15756 net.cpp:67] Creating Layer conv2
I0318 02:36:31.173845 15756 net.cpp:394] conv2 <- norm1
I0318 02:36:31.173858 15756 net.cpp:356] conv2 -> conv2
I0318 02:36:31.173872 15756 net.cpp:96] Setting up conv2
I0318 02:36:31.174792 15756 net.cpp:103] Top shape: 10 256 27 27 (1866240)
I0318 02:36:31.174799 15756 net.cpp:113] Memory required for data: 36295680
I0318 02:36:31.174818 15756 net.cpp:67] Creating Layer relu2
I0318 02:36:31.174826 15756 net.cpp:394] relu2 <- conv2
I0318 02:36:31.174839 15756 net.cpp:345] relu2 -> conv2 (in-place)
I0318 02:36:31.174849 15756 net.cpp:96] Setting up relu2
I0318 02:36:31.174856 15756 net.cpp:103] Top shape: 10 256 27 27 (1866240)
I0318 02:36:31.174860 15756 net.cpp:113] Memory required for data: 43760640
I0318 02:36:31.174870 15756 net.cpp:67] Creating Layer pool2
I0318 02:36:31.174876 15756 net.cpp:394] pool2 <- conv2
I0318 02:36:31.174890 15756 net.cpp:356] pool2 -> pool2
I0318 02:36:31.174902 15756 net.cpp:96] Setting up pool2
I0318 02:36:31.174912 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.174917 15756 net.cpp:113] Memory required for data: 45491200
I0318 02:36:31.174927 15756 net.cpp:67] Creating Layer norm2
I0318 02:36:31.174933 15756 net.cpp:394] norm2 <- pool2
I0318 02:36:31.174947 15756 net.cpp:356] norm2 -> norm2
I0318 02:36:31.174957 15756 net.cpp:96] Setting up norm2
I0318 02:36:31.174967 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.174970 15756 net.cpp:113] Memory required for data: 47221760
I0318 02:36:31.174983 15756 net.cpp:67] Creating Layer conv3
I0318 02:36:31.174990 15756 net.cpp:394] conv3 <- norm2
I0318 02:36:31.175004 15756 net.cpp:356] conv3 -> conv3
I0318 02:36:31.175015 15756 net.cpp:96] Setting up conv3
I0318 02:36:31.177199 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.177208 15756 net.cpp:113] Memory required for data: 49817600
I0318 02:36:31.177230 15756 net.cpp:67] Creating Layer relu3
I0318 02:36:31.177238 15756 net.cpp:394] relu3 <- conv3
I0318 02:36:31.177252 15756 net.cpp:345] relu3 -> conv3 (in-place)
I0318 02:36:31.177263 15756 net.cpp:96] Setting up relu3
I0318 02:36:31.177268 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.177273 15756 net.cpp:113] Memory required for data: 52413440
I0318 02:36:31.177284 15756 net.cpp:67] Creating Layer conv4
I0318 02:36:31.177289 15756 net.cpp:394] conv4 <- conv3
I0318 02:36:31.177302 15756 net.cpp:356] conv4 -> conv4
I0318 02:36:31.177316 15756 net.cpp:96] Setting up conv4
I0318 02:36:31.179213 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.179220 15756 net.cpp:113] Memory required for data: 55009280
I0318 02:36:31.179237 15756 net.cpp:67] Creating Layer relu4
I0318 02:36:31.179244 15756 net.cpp:394] relu4 <- conv4
I0318 02:36:31.179258 15756 net.cpp:345] relu4 -> conv4 (in-place)
I0318 02:36:31.179268 15756 net.cpp:96] Setting up relu4
I0318 02:36:31.179275 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.179280 15756 net.cpp:113] Memory required for data: 57605120
I0318 02:36:31.179291 15756 net.cpp:67] Creating Layer conv5
I0318 02:36:31.179296 15756 net.cpp:394] conv5 <- conv4
I0318 02:36:31.179309 15756 net.cpp:356] conv5 -> conv5
I0318 02:36:31.179322 15756 net.cpp:96] Setting up conv5
I0318 02:36:31.180598 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.180606 15756 net.cpp:113] Memory required for data: 59335680
I0318 02:36:31.180629 15756 net.cpp:67] Creating Layer relu5
I0318 02:36:31.180636 15756 net.cpp:394] relu5 <- conv5
I0318 02:36:31.180649 15756 net.cpp:345] relu5 -> conv5 (in-place)
I0318 02:36:31.180660 15756 net.cpp:96] Setting up relu5
I0318 02:36:31.180665 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.180670 15756 net.cpp:113] Memory required for data: 61066240
I0318 02:36:31.180682 15756 net.cpp:67] Creating Layer pool5
I0318 02:36:31.180690 15756 net.cpp:394] pool5 <- conv5
I0318 02:36:31.180703 15756 net.cpp:356] pool5 -> pool5
I0318 02:36:31.180716 15756 net.cpp:96] Setting up pool5
I0318 02:36:31.180726 15756 net.cpp:103] Top shape: 10 256 6 6 (92160)
I0318 02:36:31.180730 15756 net.cpp:113] Memory required for data: 61434880
I0318 02:36:31.180742 15756 net.cpp:67] Creating Layer fc6
I0318 02:36:31.180748 15756 net.cpp:394] fc6 <- pool5
I0318 02:36:31.180760 15756 net.cpp:356] fc6 -> fc6
I0318 02:36:31.180773 15756 net.cpp:96] Setting up fc6
I0318 02:36:31.262177 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.262205 15756 net.cpp:113] Memory required for data: 61598720
I0318 02:36:31.262254 15756 net.cpp:67] Creating Layer relu6
I0318 02:36:31.262269 15756 net.cpp:394] relu6 <- fc6
I0318 02:36:31.262296 15756 net.cpp:345] relu6 -> fc6 (in-place)
I0318 02:36:31.262312 15756 net.cpp:96] Setting up relu6
I0318 02:36:31.262320 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.262325 15756 net.cpp:113] Memory required for data: 61762560
I0318 02:36:31.262341 15756 net.cpp:67] Creating Layer drop6
I0318 02:36:31.262346 15756 net.cpp:394] drop6 <- fc6
I0318 02:36:31.262358 15756 net.cpp:345] drop6 -> fc6 (in-place)
I0318 02:36:31.262369 15756 net.cpp:96] Setting up drop6
I0318 02:36:31.262378 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.262382 15756 net.cpp:113] Memory required for data: 61926400
I0318 02:36:31.262408 15756 net.cpp:67] Creating Layer fc7
I0318 02:36:31.262416 15756 net.cpp:394] fc7 <- fc6
I0318 02:36:31.262430 15756 net.cpp:356] fc7 -> fc7
I0318 02:36:31.262445 15756 net.cpp:96] Setting up fc7
I0318 02:36:31.298831 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.298859 15756 net.cpp:113] Memory required for data: 62090240
I0318 02:36:31.298895 15756 net.cpp:67] Creating Layer relu7
I0318 02:36:31.298908 15756 net.cpp:394] relu7 <- fc7
I0318 02:36:31.298933 15756 net.cpp:345] relu7 -> fc7 (in-place)
I0318 02:36:31.298949 15756 net.cpp:96] Setting up relu7
I0318 02:36:31.298955 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.298960 15756 net.cpp:113] Memory required for data: 62254080
I0318 02:36:31.298974 15756 net.cpp:67] Creating Layer drop7
I0318 02:36:31.298981 15756 net.cpp:394] drop7 <- fc7
I0318 02:36:31.298993 15756 net.cpp:345] drop7 -> fc7 (in-place)
I0318 02:36:31.299003 15756 net.cpp:96] Setting up drop7
I0318 02:36:31.299011 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.299016 15756 net.cpp:113] Memory required for data: 62417920
I0318 02:36:31.299028 15756 net.cpp:67] Creating Layer fc8
I0318 02:36:31.299034 15756 net.cpp:394] fc8 <- fc7
I0318 02:36:31.299046 15756 net.cpp:356] fc8 -> fc8
I0318 02:36:31.299060 15756 net.cpp:96] Setting up fc8
I0318 02:36:31.308233 15756 net.cpp:103] Top shape: 10 1000 1 1 (10000)
I0318 02:36:31.308260 15756 net.cpp:113] Memory required for data: 62457920
I0318 02:36:31.308295 15756 net.cpp:67] Creating Layer prob
I0318 02:36:31.308310 15756 net.cpp:394] prob <- fc8
I0318 02:36:31.308336 15756 net.cpp:356] prob -> prob
I0318 02:36:31.308353 15756 net.cpp:96] Setting up prob
I0318 02:36:31.308372 15756 net.cpp:103] Top shape: 10 1000 1 1 (10000)
I0318 02:36:31.308377 15756 net.cpp:113] Memory required for data: 62497920
I0318 02:36:31.308384 15756 net.cpp:172] prob does not need backward computation.
I0318 02:36:31.308404 15756 net.cpp:172] fc8 does not need backward computation.
I0318 02:36:31.308410 15756 net.cpp:172] drop7 does not need backward computation.
I0318 02:36:31.308415 15756 net.cpp:172] relu7 does not need backward computation.
I0318 02:36:31.308420 15756 net.cpp:172] fc7 does not need backward computation.
I0318 02:36:31.308425 15756 net.cpp:172] drop6 does not need backward computation.
I0318 02:36:31.308430 15756 net.cpp:172] relu6 does not need backward computation.
I0318 02:36:31.308435 15756 net.cpp:172] fc6 does not need backward computation.
I0318 02:36:31.308440 15756 net.cpp:172] pool5 does not need backward computation.
I0318 02:36:31.308445 15756 net.cpp:172] relu5 does not need backward computation.
I0318 02:36:31.308450 15756 net.cpp:172] conv5 does not need backward computation.
I0318 02:36:31.308454 15756 net.cpp:172] relu4 does not need backward computation.
I0318 02:36:31.308459 15756 net.cpp:172] conv4 does not need backward computation.
I0318 02:36:31.308465 15756 net.cpp:172] relu3 does not need backward computation.
I0318 02:36:31.308470 15756 net.cpp:172] conv3 does not need backward computation.
I0318 02:36:31.308473 15756 net.cpp:172] norm2 does not need backward computation.
I0318 02:36:31.308478 15756 net.cpp:172] pool2 does not need backward computation.
I0318 02:36:31.308483 15756 net.cpp:172] relu2 does not need backward computation.
I0318 02:36:31.308488 15756 net.cpp:172] conv2 does not need backward computation.
I0318 02:36:31.308493 15756 net.cpp:172] norm1 does not need backward computation.
I0318 02:36:31.308498 15756 net.cpp:172] pool1 does not need backward computation.
I0318 02:36:31.308502 15756 net.cpp:172] relu1 does not need backward computation.
I0318 02:36:31.308508 15756 net.cpp:172] conv1 does not need backward computation.
I0318 02:36:31.308512 15756 net.cpp:208] This network produces output prob
I0318 02:36:31.308542 15756 net.cpp:467] Collecting Learning Rate and Weight Decay.
I0318 02:36:31.308554 15756 net.cpp:219] Network initialization done.
I0318 02:36:31.308558 15756 net.cpp:220] Memory required for data: 62497920
E0318 02:36:31.683995 15756 upgrade_proto.cpp:611] Attempting to upgrade input file specified using deprecated transformation parameters: ../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
I0318 02:36:31.684034 15756 upgrade_proto.cpp:614] Successfully upgraded file specified using deprecated data transformation parameters.
E0318 02:36:31.684039 15756 upgrade_proto.cpp:616] Note that future Caffe releases will only support transform_param messages for transformation fields.
I0318 02:36:31.684048 15756 net.cpp:702] Ignoring source layer data
I0318 02:36:31.684052 15756 net.cpp:705] Copying source layer conv1
I0318 02:36:31.684367 15756 net.cpp:705] Copying source layer relu1
I0318 02:36:31.684375 15756 net.cpp:705] Copying source layer pool1
I0318 02:36:31.684380 15756 net.cpp:705] Copying source layer norm1
I0318 02:36:31.684383 15756 net.cpp:705] Copying source layer conv2
I0318 02:36:31.687021 15756 net.cpp:705] Copying source layer relu2
I0318 02:36:31.687028 15756 net.cpp:705] Copying source layer pool2
I0318 02:36:31.687033 15756 net.cpp:705] Copying source layer norm2
I0318 02:36:31.687038 15756 net.cpp:705] Copying source layer conv3
I0318 02:36:31.694574 15756 net.cpp:705] Copying source layer relu3
I0318 02:36:31.694586 15756 net.cpp:705] Copying source layer conv4
I0318 02:36:31.700258 15756 net.cpp:705] Copying source layer relu4
I0318 02:36:31.700266 15756 net.cpp:705] Copying source layer conv5
I0318 02:36:31.704053 15756 net.cpp:705] Copying source layer relu5
I0318 02:36:31.704062 15756 net.cpp:705] Copying source layer pool5
I0318 02:36:31.704067 15756 net.cpp:705] Copying source layer fc6
I0318 02:36:32.025676 15756 net.cpp:705] Copying source layer relu6
I0318 02:36:32.025704 15756 net.cpp:705] Copying source layer drop6
I0318 02:36:32.025709 15756 net.cpp:705] Copying source layer fc7
I0318 02:36:32.168704 15756 net.cpp:705] Copying source layer relu7
I0318 02:36:32.168730 15756 net.cpp:705] Copying source layer drop7
I0318 02:36:32.168735 15756 net.cpp:705] Copying source layer fc8
I0318 02:36:32.203642 15756 net.cpp:702] Ignoring source layer loss
['n02123045 tabby, tabby cat' 'n02127052 lynx, catamount'
'n02124075 Egyptian cat' 'n07930864 cup' 'n02123159 tiger cat']
[281 287 285 968 282]
其他参考:
http://blog.csdn.net/sunbaigui/article/details/39938097
http://dataunion.org/10781.html
http://djt.qq.com/article/view/1245