Convolutional Neural Networks (CNNs) are responsible for the major breakthroughs in image recognition made in the past few years. In this chapter we will cover:
- Implementing a Simpler CNN
- Implementing an Advanced CNN
- Retraining Existing CNN models
- Applying Stylenet/Neural-Style
- Implementing DeepDream
With the basic operations covered, we now move on to building a concrete model and visualizing the results.
Let's look at how to do this with TensorFlow.
A look at the log:
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting temp/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting temp/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting temp/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting temp/t10k-labels-idx1-ubyte.gz
Generation # 5. Train Loss: 2.28. Train Acc (Test Acc): 14.00 (14.00)
Generation # 10. Train Loss: 2.22. Train Acc (Test Acc): 14.00 (22.00)
Generation # 15. Train Loss: 2.11. Train Acc (Test Acc): 40.00 (33.40)
Generation # 20. Train Loss: 2.05. Train Acc (Test Acc): 48.00 (50.00)
Generation # 25. Train Loss: 1.93. Train Acc (Test Acc): 52.00 (57.00)
Generation # 30. Train Loss: 1.69. Train Acc (Test Acc): 64.00 (62.60)
Generation # 35. Train Loss: 1.43. Train Acc (Test Acc): 67.00 (64.40)
Generation # 40. Train Loss: 1.22. Train Acc (Test Acc): 63.00 (70.80)
Generation # 45. Train Loss: 0.87. Train Acc (Test Acc): 82.00 (76.80)
Generation # 50. Train Loss: 0.76. Train Acc (Test Acc): 80.00 (77.20)
Generation # 55. Train Loss: 0.66. Train Acc (Test Acc): 80.00 (75.40)
Generation # 60. Train Loss: 0.59. Train Acc (Test Acc): 81.00 (80.80)
Generation # 65. Train Loss: 0.55. Train Acc (Test Acc): 79.00 (85.60)
Generation # 70. Train Loss: 0.41. Train Acc (Test Acc): 85.00 (81.80)
Generation # 75. Train Loss: 0.57. Train Acc (Test Acc): 83.00 (85.20)
Generation # 80. Train Loss: 0.39. Train Acc (Test Acc): 90.00 (86.00)
Generation # 85. Train Loss: 0.39. Train Acc (Test Acc): 90.00 (86.00)
Generation # 90. Train Loss: 0.26. Train Acc (Test Acc): 92.00 (90.60)
Generation # 95. Train Loss: 0.32. Train Acc (Test Acc): 90.00 (87.60)
Generation # 100. Train Loss: 0.39. Train Acc (Test Acc): 89.00 (89.80)
Generation # 105. Train Loss: 0.49. Train Acc (Test Acc): 85.00 (90.00)
Generation # 110. Train Loss: 0.34. Train Acc (Test Acc): 88.00 (90.00)
Generation # 115. Train Loss: 0.24. Train Acc (Test Acc): 91.00 (89.20)
Generation # 120. Train Loss: 0.30. Train Acc (Test Acc): 92.00 (91.40)
Generation # 125. Train Loss: 0.29. Train Acc (Test Acc): 89.00 (91.60)
Generation # 130. Train Loss: 0.31. Train Acc (Test Acc): 93.00 (90.20)
Generation # 135. Train Loss: 0.41. Train Acc (Test Acc): 85.00 (91.40)
Generation # 140. Train Loss: 0.22. Train Acc (Test Acc): 94.00 (91.40)
Generation # 145. Train Loss: 0.39. Train Acc (Test Acc): 85.00 (92.60)
Generation # 150. Train Loss: 0.38. Train Acc (Test Acc): 93.00 (90.00)
Generation # 155. Train Loss: 0.17. Train Acc (Test Acc): 96.00 (91.60)
Generation # 160. Train Loss: 0.22. Train Acc (Test Acc): 93.00 (93.20)
Generation # 165. Train Loss: 0.15. Train Acc (Test Acc): 97.00 (92.00)
Generation # 170. Train Loss: 0.24. Train Acc (Test Acc): 92.00 (93.40)
Generation # 175. Train Loss: 0.21. Train Acc (Test Acc): 93.00 (92.40)
Generation # 180. Train Loss: 0.35. Train Acc (Test Acc): 90.00 (91.80)
Generation # 185. Train Loss: 0.15. Train Acc (Test Acc): 95.00 (93.80)
Generation # 190. Train Loss: 0.17. Train Acc (Test Acc): 96.00 (91.60)
Generation # 195. Train Loss: 0.26. Train Acc (Test Acc): 89.00 (92.20)
Generation # 200. Train Loss: 0.32. Train Acc (Test Acc): 91.00 (91.40)
Generation # 205. Train Loss: 0.20. Train Acc (Test Acc): 93.00 (93.60)
Generation # 210. Train Loss: 0.16. Train Acc (Test Acc): 97.00 (93.80)
Generation # 215. Train Loss: 0.18. Train Acc (Test Acc): 95.00 (91.60)
Generation # 220. Train Loss: 0.21. Train Acc (Test Acc): 96.00 (92.40)
Generation # 225. Train Loss: 0.23. Train Acc (Test Acc): 93.00 (94.80)
Generation # 230. Train Loss: 0.16. Train Acc (Test Acc): 97.00 (96.60)
Generation # 235. Train Loss: 0.19. Train Acc (Test Acc): 93.00 (94.80)
Generation # 240. Train Loss: 0.12. Train Acc (Test Acc): 97.00 (95.20)
Generation # 245. Train Loss: 0.16. Train Acc (Test Acc): 96.00 (92.20)
Generation # 250. Train Loss: 0.22. Train Acc (Test Acc): 92.00 (93.80)
Generation # 255. Train Loss: 0.22. Train Acc (Test Acc): 94.00 (95.00)
Generation # 260. Train Loss: 0.22. Train Acc (Test Acc): 90.00 (93.40)
Generation # 265. Train Loss: 0.23. Train Acc (Test Acc): 93.00 (94.40)
Generation # 270. Train Loss: 0.11. Train Acc (Test Acc): 96.00 (92.00)
Generation # 275. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (93.20)
Generation # 280. Train Loss: 0.17. Train Acc (Test Acc): 97.00 (95.60)
Generation # 285. Train Loss: 0.25. Train Acc (Test Acc): 90.00 (95.20)
Generation # 290. Train Loss: 0.17. Train Acc (Test Acc): 95.00 (95.80)
Generation # 295. Train Loss: 0.20. Train Acc (Test Acc): 93.00 (96.40)
Generation # 300. Train Loss: 0.12. Train Acc (Test Acc): 96.00 (93.60)
Generation # 305. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (94.20)
Generation # 310. Train Loss: 0.37. Train Acc (Test Acc): 88.00 (94.80)
Generation # 315. Train Loss: 0.19. Train Acc (Test Acc): 96.00 (93.40)
Generation # 320. Train Loss: 0.17. Train Acc (Test Acc): 95.00 (96.20)
Generation # 325. Train Loss: 0.16. Train Acc (Test Acc): 92.00 (93.80)
Generation # 330. Train Loss: 0.17. Train Acc (Test Acc): 96.00 (94.00)
Generation # 335. Train Loss: 0.14. Train Acc (Test Acc): 96.00 (95.20)
Generation # 340. Train Loss: 0.15. Train Acc (Test Acc): 96.00 (96.60)
Generation # 345. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (95.60)
Generation # 350. Train Loss: 0.27. Train Acc (Test Acc): 91.00 (97.00)
Generation # 355. Train Loss: 0.11. Train Acc (Test Acc): 98.00 (94.60)
Generation # 360. Train Loss: 0.15. Train Acc (Test Acc): 95.00 (95.20)
Generation # 365. Train Loss: 0.08. Train Acc (Test Acc): 98.00 (96.40)
Generation # 370. Train Loss: 0.15. Train Acc (Test Acc): 94.00 (93.80)
Generation # 375. Train Loss: 0.21. Train Acc (Test Acc): 92.00 (96.60)
Generation # 380. Train Loss: 0.21. Train Acc (Test Acc): 96.00 (94.40)
Generation # 385. Train Loss: 0.07. Train Acc (Test Acc): 99.00 (95.40)
Generation # 390. Train Loss: 0.19. Train Acc (Test Acc): 94.00 (95.40)
Generation # 395. Train Loss: 0.12. Train Acc (Test Acc): 97.00 (94.40)
Generation # 400. Train Loss: 0.14. Train Acc (Test Acc): 96.00 (96.60)
Generation # 405. Train Loss: 0.17. Train Acc (Test Acc): 95.00 (96.60)
Generation # 410. Train Loss: 0.16. Train Acc (Test Acc): 93.00 (96.40)
Generation # 415. Train Loss: 0.18. Train Acc (Test Acc): 93.00 (95.60)
Generation # 420. Train Loss: 0.11. Train Acc (Test Acc): 95.00 (94.80)
Generation # 425. Train Loss: 0.22. Train Acc (Test Acc): 91.00 (95.20)
Generation # 430. Train Loss: 0.07. Train Acc (Test Acc): 98.00 (96.20)
Generation # 435. Train Loss: 0.11. Train Acc (Test Acc): 97.00 (95.80)
Generation # 440. Train Loss: 0.07. Train Acc (Test Acc): 97.00 (95.20)
Generation # 445. Train Loss: 0.15. Train Acc (Test Acc): 99.00 (97.80)
Generation # 450. Train Loss: 0.09. Train Acc (Test Acc): 98.00 (95.00)
Generation # 455. Train Loss: 0.07. Train Acc (Test Acc): 97.00 (95.80)
Generation # 460. Train Loss: 0.08. Train Acc (Test Acc): 98.00 (94.60)
Generation # 465. Train Loss: 0.07. Train Acc (Test Acc): 98.00 (95.40)
Generation # 470. Train Loss: 0.14. Train Acc (Test Acc): 98.00 (94.40)
Generation # 475. Train Loss: 0.24. Train Acc (Test Acc): 93.00 (96.40)
Generation # 480. Train Loss: 0.08. Train Acc (Test Acc): 99.00 (94.40)
Generation # 485. Train Loss: 0.16. Train Acc (Test Acc): 96.00 (96.40)
Generation # 490. Train Loss: 0.09. Train Acc (Test Acc): 95.00 (96.40)
Generation # 495. Train Loss: 0.13. Train Acc (Test Acc): 95.00 (96.20)
Generation # 500. Train Loss: 0.09. Train Acc (Test Acc): 99.00 (95.80)
Training...
Code walkthrough:
- Loading the data
# Introductory CNN Model: MNIST Digits
#---------------------------------------
#
# In this example, we will download the MNIST handwritten
# digits and create a simple CNN network to predict the
# digit category (0-9)
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets
from tensorflow.python.framework import ops
ops.reset_default_graph()
# Start a graph session
sess = tf.Session()
# Load data
data_dir = 'temp'
mnist = read_data_sets(data_dir)
# Convert images into 28x28 (they are downloaded as 1x784)
train_xdata = np.array([np.reshape(x, (28,28)) for x in mnist.train.images])
test_xdata = np.array([np.reshape(x, (28,28)) for x in mnist.test.images])
# Labels are loaded as integer class ids (0-9), not one-hot vectors
train_labels = mnist.train.labels
test_labels = mnist.test.labels
Each image in train_xdata is paired with the label at the same index in train_labels, and likewise for test_xdata and test_labels.
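The reshape step above can be tried in isolation. The following is a minimal sketch that uses a small synthetic array (`flat_images`, a hypothetical stand-in for `mnist.train.images`) so it runs without downloading anything; it also shows that a single vectorized reshape gives the same result as the list comprehension.

```python
import numpy as np

# Hypothetical stand-in for mnist.train.images: 3 flattened "images" of 784 pixels each.
flat_images = np.arange(3 * 784, dtype=np.float32).reshape(3, 784)

# Same per-image reshape as in the walkthrough.
looped = np.array([np.reshape(x, (28, 28)) for x in flat_images])

# Equivalent single reshape; -1 lets NumPy infer the number of images.
vectorized = flat_images.reshape(-1, 28, 28)

print(looped.shape)                        # (3, 28, 28)
print(np.array_equal(looped, vectorized))  # True
```

Both forms lay the 784 values out row by row, so pixel 28 of the flat vector becomes row 1, column 0 of the image.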
- Set model parameters
# Set model parameters
batch_size = 100
learning_rate = 0.005
evaluation_size = 500
image_width = train_xdata[0].shape[0]
image_height = train_xdata[0].shape[1]
target_size = max(train_labels) + 1
num_channels = 1 # greyscale = 1 channel
generations = 500
eval_every = 5
conv1_features = 25
conv2_features = 50
max_pool_size1 = 2 # NxN window for 1st max pool layer
max_pool_size2 = 2 # NxN window for 2nd max pool layer
fully_connected_size1 = 100
- Building the graph connections (placeholders and variables)
# Declare model placeholders
x_input_shape = (batch_size, image_width, image_height, num_channels)  # (100, 28, 28, 1)
x_input = tf.placeholder(tf.float32, shape=x_input_shape)
y_target = tf.placeholder(tf.int32, shape=[batch_size])
# Test/Evaluation placeholders
eval_input_shape = (evaluation_size, image_width, image_height, num_channels)  # (500, 28, 28, 1)
eval_input = tf.placeholder(tf.float32, shape=eval_input_shape)
eval_target = tf.placeholder(tf.int32, shape=[evaluation_size])
# Declare model parameters
# First convolutional layer
conv1_weight = tf.Variable(tf.truncated_normal([4, 4, num_channels, conv1_features], stddev=0.1, dtype=tf.float32))  # conv1_features = 25 conv kernels
conv1_bias = tf.Variable(tf.zeros([conv1_features], dtype=tf.float32))
# Second convolutional layer
conv2_weight = tf.Variable(tf.truncated_normal([4, 4, conv1_features, conv2_features], stddev=0.1, dtype=tf.float32))  # conv2_features = 50 conv kernels
conv2_bias = tf.Variable(tf.zeros([conv2_features], dtype=tf.float32))
# Fully connected variables
# After two 2x2 poolings, each side shrinks to 1/4 of its original length
resulting_width = image_width // (max_pool_size1 * max_pool_size2)
resulting_height = image_height // (max_pool_size1 * max_pool_size2)
# First fully connected layer
full1_input_size = resulting_width * resulting_height * conv2_features  # total size of the 50 downsampled feature maps: 7 * 7 * 50 = 2450 inputs
full1_weight = tf.Variable(tf.truncated_normal([full1_input_size, fully_connected_size1], stddev=0.1, dtype=tf.float32))  # fully_connected_size1 = 100
full1_bias = tf.Variable(tf.truncated_normal([fully_connected_size1], stddev=0.1, dtype=tf.float32))
# Second fully connected layer
full2_weight = tf.Variable(tf.truncated_normal([fully_connected_size1, target_size], stddev=0.1, dtype=tf.float32))
full2_bias = tf.Variable(tf.truncated_normal([target_size], stddev=0.1, dtype=tf.float32))
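The size of the first fully connected layer follows from simple shape arithmetic. The sketch below redoes that arithmetic in plain Python with the parameter values from this post; `same_pool_output` is a hypothetical helper name, and it assumes the usual rule that a 'SAME'-padded pooling with stride equal to its window produces ceil(size / window) outputs.

```python
import math

def same_pool_output(size, window):
    """Output length of a 'SAME'-padded pooling whose stride equals its window."""
    return math.ceil(size / window)

image_width = image_height = 28
max_pool_size1 = max_pool_size2 = 2
conv2_features = 50

# The stride-1 'SAME' convolutions keep width and height; only the pools shrink them.
width = same_pool_output(same_pool_output(image_width, max_pool_size1), max_pool_size2)
height = same_pool_output(same_pool_output(image_height, max_pool_size1), max_pool_size2)
full1_input_size = width * height * conv2_features

print(width, height, full1_input_size)  # 7 7 2450

# The floor-division shortcut used in the walkthrough agrees for 28x28 images.
shortcut = image_width // (max_pool_size1 * max_pool_size2)
print(shortcut == width)  # True
```

The shortcut only matches because 28 is divisible by 4. For a width of 30, for example, the floor division gives 7 while two 'SAME' poolings give 8, so the weight matrix would no longer fit the flattened output.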
- Building the graph structure
# Initialize Model Operations
def my_conv_net(input_data):
    # First Conv-ReLU-MaxPool Layer
    conv1 = tf.nn.conv2d(input_data, conv1_weight, strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_bias))
    max_pool1 = tf.nn.max_pool(relu1,
                               ksize=[1, max_pool_size1, max_pool_size1, 1],
                               strides=[1, max_pool_size1, max_pool_size1, 1],
                               padding='SAME')
    # Second Conv-ReLU-MaxPool Layer
    conv2 = tf.nn.conv2d(max_pool1, conv2_weight, strides=[1, 1, 1, 1], padding='SAME')
    relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_bias))
    max_pool2 = tf.nn.max_pool(relu2,
                               ksize=[1, max_pool_size2, max_pool_size2, 1],
                               strides=[1, max_pool_size2, max_pool_size2, 1],
                               padding='SAME')
    # Transform Output into a 1xN layer for next fully connected layer
    final_conv_shape = max_pool2.get_shape().as_list()
    final_shape = final_conv_shape[1] * final_conv_shape[2] * final_conv_shape[3]
    flat_output = tf.reshape(max_pool2, [final_conv_shape[0], final_shape])
    # First Fully Connected Layer
    fully_connected1 = tf.nn.relu(tf.add(tf.matmul(flat_output, full1_weight), full1_bias))
    # Second Fully Connected Layer
    final_model_output = tf.add(tf.matmul(fully_connected1, full2_weight), full2_bias)
    return(final_model_output)

model_output = my_conv_net(x_input)
test_model_output = my_conv_net(eval_input)
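To see what each max-pool layer does to a feature map, here is a small NumPy sketch of non-overlapping 2x2 max pooling. It is an illustration of the idea only, not how TensorFlow implements `tf.nn.max_pool`; `max_pool_2x2` is a hypothetical helper and assumes even height and width.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling on an (H, W) array with even H and W."""
    h, w = feature_map.shape
    # Split rows and columns into pairs, then take the max inside each 2x2 block.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[ 5  7]
#  [13 15]]

# Two poolings take a 28x28 map down to 7x7, as in the network above.
twice = max_pool_2x2(max_pool_2x2(np.zeros((28, 28))))
print(twice.shape)  # (7, 7)
```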
- Loss and solver
# Declare Loss Function (softmax cross entropy)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=model_output, labels=y_target))
# Create an optimizer
my_optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9)
train_step = my_optimizer.minimize(loss)
The loss is minimized by gradient descent, using the Momentum variant as the optimizer.
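As a rough illustration of the momentum update, the sketch below applies the rule documented for `tf.train.MomentumOptimizer` (accumulate `momentum * accum + grad`, then step by `learning_rate * accum`) to a toy loss f(w) = w^2. The toy loss, the starting point, and the helper name `momentum_step` are assumptions made for this example; the learning rate 0.005 and momentum 0.9 are the values used in this post.

```python
def momentum_step(w, accum, grad, learning_rate=0.005, momentum=0.9):
    """One momentum update: grow the velocity, then move against it."""
    accum = momentum * accum + grad
    w = w - learning_rate * accum
    return w, accum

# Toy problem (an assumption for illustration): minimize f(w) = w**2, so grad = 2 * w.
w, accum = 1.0, 0.0
first_w, first_accum = momentum_step(w, accum, grad=2.0 * w)
print(first_w, first_accum)  # 0.99 2.0

for _ in range(2000):
    w, accum = momentum_step(w, accum, grad=2.0 * w)
print(abs(w) < 1e-6)  # True
```

Because the velocity accumulates past gradients, consecutive steps in the same direction grow larger than plain gradient descent would take with the same learning rate.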
- Training
# Create a prediction function
prediction = tf.nn.softmax(model_output)
test_prediction = tf.nn.softmax(test_model_output)
# Create accuracy function
def get_accuracy(logits, targets):
    batch_predictions = np.argmax(logits, axis=1)
    num_correct = np.sum(np.equal(batch_predictions, targets))
    return(100. * num_correct/batch_predictions.shape[0])
# Initialize Variables
init = tf.initialize_all_variables()
sess.run(init)
# Start training loop
train_loss = []
train_acc = []
test_acc = []
for i in range(generations):
    rand_index = np.random.choice(len(train_xdata), size=batch_size)
    # train_xdata is 3-D: (sample index, h, w)
    rand_x = train_xdata[rand_index]
    # Add a fourth dimension for the channel
    rand_x = np.expand_dims(rand_x, 3)
    rand_y = train_labels[rand_index]
    # x_input was declared as 4-D (sample index, h, w, ch), as the convolution API requires
    train_dict = {x_input: rand_x, y_target: rand_y}
    # training...
    sess.run(train_step, feed_dict=train_dict)
    # for loss
    temp_train_loss = sess.run(loss, feed_dict=train_dict)
    # for accuracy
    temp_train_preds = sess.run(prediction, feed_dict=train_dict)
    temp_train_acc = get_accuracy(temp_train_preds, rand_y)
    # Training runs on every iteration, but the log only needs to be printed every five
    # NB: the logging step also includes the test evaluation
    if (i+1) % eval_every == 0:
        eval_index = np.random.choice(len(test_xdata), size=evaluation_size)
        eval_x = test_xdata[eval_index]
        eval_x = np.expand_dims(eval_x, 3)
        eval_y = test_labels[eval_index]
        test_dict = {eval_input: eval_x, eval_target: eval_y}
        test_preds = sess.run(test_prediction, feed_dict=test_dict)
        temp_test_acc = get_accuracy(test_preds, eval_y)
        # Record and print results: three metrics are tracked (loss is not recorded for the test set)
        train_loss.append(temp_train_loss)
        train_acc.append(temp_train_acc)
        test_acc.append(temp_test_acc)
        acc_and_loss = [(i+1), temp_train_loss, temp_train_acc, temp_test_acc]
        acc_and_loss = [np.round(x,2) for x in acc_and_loss]
        print('Generation # {}. Train Loss: {:.2f}. Train Acc (Test Acc): {:.2f} ({:.2f})'.format(*acc_and_loss))
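The accuracy helper can be checked on a tiny hand-made batch without running the network. The sketch below uses made-up scores for four samples and three classes (hypothetical values, not MNIST output) and a plain NumPy `softmax` written for this example; it also shows that taking the argmax of the softmax output picks the same class as taking the argmax of the raw scores.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def get_accuracy(scores, targets):
    """Percentage of rows whose highest-scoring class equals the target."""
    batch_predictions = np.argmax(scores, axis=1)
    num_correct = np.sum(np.equal(batch_predictions, targets))
    return 100.0 * num_correct / batch_predictions.shape[0]

# Made-up scores for 4 samples and 3 classes.
logits = np.array([[1.0, 3.0, 0.5],
                   [2.0, 0.1, 0.3],
                   [0.2, 0.1, 1.5],
                   [0.4, 0.9, 0.3]])
targets = np.array([1, 0, 2, 0])

acc_from_logits = get_accuracy(logits, targets)
acc_from_probs = get_accuracy(softmax(logits), targets)
print(acc_from_logits, acc_from_probs)  # 75.0 75.0
```

Note also that `np.random.choice` samples with replacement by default, so a batch drawn as in the loop above may contain the same image more than once.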
- Showing the results
# Matplotlib code to plot the loss and accuracies
eval_indices = range(0, generations, eval_every)
# Plot loss over time
plt.plot(eval_indices, train_loss, 'k-')
plt.title('Softmax Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Softmax Loss')
plt.show()
# Plot train and test accuracy
plt.plot(eval_indices, train_acc, 'k-', label='Train Set Accuracy')
plt.plot(eval_indices, test_acc, 'r--', label='Test Set Accuracy')
plt.title('Train and Test Accuracy')
plt.xlabel('Generation')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()
# Plot some samples
# Plot 6 of the last batch results:
actuals = rand_y[0:6]
predictions = np.argmax(temp_train_preds,axis=1)[0:6]
images = np.squeeze(rand_x[0:6])
Nrows = 2
Ncols = 3
for i in range(6):
    plt.subplot(Nrows, Ncols, i+1)
    plt.imshow(np.reshape(images[i], [28,28]), cmap='Greys_r')
    plt.title('Actual: ' + str(actuals[i]) + ' Pred: ' + str(predictions[i]),
              fontsize=10)
    frame = plt.gca()
    frame.axes.get_xaxis().set_visible(False)
    frame.axes.get_yaxis().set_visible(False)
plt.show()
NB: tf.layers.conv2d is much more convenient to use than tf.nn.conv2d!
conv1 = tf.layers.conv2d(X,
                         convlayer_sizes[0],  # number of filters, i.e. how many feature maps the convolution produces
                         kernel_size=filter_shape,
                         padding=padding,
                         activation=tf.nn.relu,
                         bias_initializer=tf.zeros_initializer(),
                         kernel_regularizer=tf.nn.l2_loss,
                         bias_regularizer=tf.nn.l2_loss,
                         name="conv1")
It also runs somewhat more efficiently.
$ python train.py
Extracting data/mnist/train-images-idx3-ubyte.gz
Extracting data/mnist/train-labels-idx1-ubyte.gz
Extracting data/mnist/t10k-images-idx3-ubyte.gz
Extracting data/mnist/t10k-labels-idx1-ubyte.gz
Starting Training...
Epoch 0, Training Loss: 1.60813545597, Test accuracy: 0.938201121795, time: 5.14s, total time: 6.15s
Epoch 1, Training Loss: 1.52455213715, Test accuracy: 0.950721153846, time: 4.53s, total time: 11.64s
Epoch 2, Training Loss: 1.51011738271, Test accuracy: 0.964743589744, time: 4.59s, total time: 17.22s
Epoch 3, Training Loss: 1.50035703349, Test accuracy: 0.966746794872, time: 4.48s, total time: 22.65s
Epoch 4, Training Loss: 1.49455949921, Test accuracy: 0.965044070513, time: 4.53s, total time: 28.17s
Epoch 5, Training Loss: 1.49036714219, Test accuracy: 0.969851762821, time: 4.58s, total time: 33.72s
Epoch 6, Training Loss: 1.48730323059, Test accuracy: 0.973657852564, time: 4.46s, total time: 39.21s
Epoch 7, Training Loss: 1.48489333978, Test accuracy: 0.972155448718, time: 4.51s, total time: 44.72s
Epoch 8, Training Loss: 1.48286200987, Test accuracy: 0.975260416667, time: 4.52s, total time: 50.22s
Epoch 9, Training Loss: 1.48113634647, Test accuracy: 0.973657852564, time: 4.5s, total time: 55.69s
Epoch 10, Training Loss: 1.47969393491, Test accuracy: 0.974559294872, time: 4.56s, total time: 61.21s
Epoch 11, Training Loss: 1.47822354946, Test accuracy: 0.975360576923, time: 4.69s, total time: 66.84s
Epoch 12, Training Loss: 1.47717202571, Test accuracy: 0.975360576923, time: 4.61s, total time: 72.51s
Epoch 13, Training Loss: 1.47643559991, Test accuracy: 0.974859775641, time: 4.53s, total time: 77.97s
Epoch 14, Training Loss: 1.4753200641, Test accuracy: 0.977864583333, time: 4.66s, total time: 83.59s
Epoch 15, Training Loss: 1.47432387181, Test accuracy: 0.97796474359, time: 4.44s, total time: 88.93s
Epoch 16, Training Loss: 1.47376608154, Test accuracy: 0.978565705128, time: 4.52s, total time: 94.4s
Epoch 17, Training Loss: 1.47339749864, Test accuracy: 0.976362179487, time: 4.53s, total time: 99.92s
Epoch 18, Training Loss: 1.47282876363, Test accuracy: 0.979967948718, time: 4.61s, total time: 105.53s
Epoch 19, Training Loss: 1.47205684374, Test accuracy: 0.980268429487, time: 4.53s, total time: 110.99s
Total training time: 110.99s
Confusion Matrix:
[[ 972 1 1 0 1 2 8 2 5 3]
[ 0 1123 6 0 0 0 2 2 0 4]
[ 1 4 1006 3 1 2 1 14 5 1]
[ 0 1 2 993 0 5 1 4 4 5]
[ 0 0 2 0 970 0 4 1 1 6]
[ 2 3 0 5 0 877 1 0 2 5]
[ 2 0 3 0 1 2 936 0 1 0]
[ 1 1 6 4 1 1 0 999 4 10]
[ 2 2 6 4 2 3 5 3 951 5]
[ 0 0 0 1 6 0 0 3 1 970]]
Training Complete
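The confusion matrix printed above can be summarized with a few lines of NumPy. This sketch copies the matrix from the log and computes the overall accuracy (correct predictions on the diagonal divided by all predictions) and the single most frequent confusion. The log does not say which axis holds the true labels, so the sketch sticks to quantities that do not depend on that choice.

```python
import numpy as np

# Confusion matrix copied from the log above.
confusion = np.array([
    [972,    1,    1,   0,   1,   2,   8,   2,   5,   3],
    [  0, 1123,    6,   0,   0,   0,   2,   2,   0,   4],
    [  1,    4, 1006,   3,   1,   2,   1,  14,   5,   1],
    [  0,    1,    2, 993,   0,   5,   1,   4,   4,   5],
    [  0,    0,    2,   0, 970,   0,   4,   1,   1,   6],
    [  2,    3,    0,   5,   0, 877,   1,   0,   2,   5],
    [  2,    0,    3,   0,   1,   2, 936,   0,   1,   0],
    [  1,    1,    6,   4,   1,   1,   0, 999,   4,  10],
    [  2,    2,    6,   4,   2,   3,   5,   3, 951,   5],
    [  0,    0,    0,   1,   6,   0,   0,   3,   1, 970],
])

correct = np.trace(confusion)   # predictions on the diagonal
total = confusion.sum()         # all evaluated samples
print(correct, total, correct / total)  # 9797 10000 0.9797

# Largest off-diagonal cell: the pair of digits mixed up most often.
off_diagonal = confusion.copy()
np.fill_diagonal(off_diagonal, 0)
row, col = np.unravel_index(off_diagonal.argmax(), off_diagonal.shape)
print(row, col, off_diagonal[row, col])  # 2 7 14
```

In this run the digits 2 and 7 are the pair confused most often, with 14 samples in that cell.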