Image Classification Series, Part 1

2024-03-04 22:00:42

Image classification

Intel Image Classification
Image Scene Classification of Multiclass

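Part 1 of the image classification series. Each post covers one image classification method and includes the complete code.
Task type: six-class image classification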

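In this post you will learn:
● The basic workflow of image classification
● Classification with a custom CNN model, accuracy 0.84
● Feature extraction with VGG followed by classification, accuracy 0.908
● Model stacking followed by classification, accuracy 0.918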

1.1 Dataset overview

Dataset contents: the training and validation sets are already split. Each folder contains six subfolders: buildings, forest, glacier, mountain, sea, street.
The paths look like …/input/train/buildings/1.png,2.png …

Dataset task: this is image data of natural scenes from around the world. It contains about 25,000 images of size 150x150, distributed across 6 categories.

'buildings' -> 0
'forest' -> 1
'glacier' -> 2
'mountain' -> 3
'sea' -> 4
'street' -> 5

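The training, test, and prediction data are packaged in separate zip files. There are about 14k images in the training set, 3k in the test set, and 7k in the prediction set. The data was originally published by Intel at https://datahack.analyticsvidhya.com to host an image classification challenge.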
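Before training it is worth checking how many images each class folder actually holds. The following is a minimal sketch, assuming the folder-per-class layout described above. The helper name count_images_per_class and the tiny synthetic directory are illustrative and are not part of the original notebook.

```python
import os
import tempfile
from collections import Counter

def count_images_per_class(directory):
    """Return {class_folder_name: number_of_files} for a folder-per-class layout."""
    counts = Counter()
    for class_name in sorted(os.listdir(directory)):
        class_dir = os.path.join(directory, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(os.listdir(class_dir))
    return dict(counts)

# Tiny synthetic layout (hypothetical file names) standing in for the real dataset
with tempfile.TemporaryDirectory() as root:
    for class_name, n_files in [("buildings", 3), ("forest", 2)]:
        os.makedirs(os.path.join(root, class_name))
        for k in range(n_files):
            open(os.path.join(root, class_name, f"{k}.jpg"), "w").close()
    demo_counts = count_images_per_class(root)

print(demo_counts)
```

On the real data you would pass the training folder instead of the temporary directory. Strongly unequal counts would be a reason to look at per-class metrics and not only at overall accuracy.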

All in all, this is an attractive dataset and a reasonably meaningful, "good" one to practice on.

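1.2 Load libraries and read the data

Load the required libraries

Note: if the cv2 library is missing in your local environment, do not run pip install cv2, which fails. Use this instead:
pip install opencv-python

To save the model diagram as an image you also need pip install pydot and an installation of graphviz-2.38.msi. Neither is needed just to run the competition code.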

import tensorflow.keras.layers as Layers
import tensorflow.keras.activations as Activations
import tensorflow.keras.models as Models
import tensorflow.keras.optimizers as Optimizer
import tensorflow.keras.metrics as Metrics
import tensorflow.keras.utils as Utils
from tensorflow.keras.utils import model_to_dot  # use tf.keras consistently instead of keras.utils.vis_utils
import os
import matplotlib.pyplot as plot
import cv2
import numpy as np
from sklearn.utils import shuffle
from sklearn.metrics import confusion_matrix as CM
from random import randint
from IPython.display import SVG
import matplotlib.gridspec as gridspec

Read the data

label2int = {'buildings':0,'forest':1,'glacier':2,'mountain':3,'sea':4,'street':5}
int2label = dict([val,key] for key,val in label2int.items())

def get_images(directory):
    Images = []
    Labels = []

    for labels in os.listdir(directory):
        # Assumption: folders that are not class names (e.g. an unlabeled prediction folder) get label -1.
        label = label2int.get(labels, -1)

        for image_file in os.listdir(os.path.join(directory, labels)):  # file names inside the class folder
            image = cv2.imread(os.path.join(directory, labels, image_file))  # read the image (OpenCV, BGR order)
            if image is None:  # skip files that OpenCV cannot decode
                continue
            image = cv2.resize(image, (150, 150))  # some images differ in size, so resize all to 150x150
            Images.append(image)
            Labels.append(label)

    return shuffle(Images, Labels, random_state=817328462)  # shuffle images and labels together

def get_classlabel(class_code):
    labels = {2:'glacier', 4:'sea', 0:'buildings', 1:'forest', 5:'street', 3:'mountain'}
    
    return labels[class_code]

Images, Labels = get_images('../input/seg_train/') #Extract the training images from the folders.

Images = np.array(Images) #converting the list of images to numpy array.
Labels = np.array(Labels)

Inspect the data

Let's look at the shape of the training data. It is (number of training images, image height, image width, number of channels). This shape matters: without resizing every image to the same size, the images could not be stacked into one regular 4-D array (at best you would get an array of shape (number of images,)), and such an array cannot be fed to the model.

print("Shape of Images:",Images.shape)
print("Shape of Labels:",Labels.shape)

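The shape reads as: number of images, image height, image width, number of channels (3 color channels; note that cv2.imread returns them in BGR order).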
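The point about resizing can be illustrated without the dataset. The sketch below uses synthetic all-zero arrays in place of real images (an assumption made purely for illustration): equally sized images stack into one 4-D batch, while images of different sizes can only be kept in a 1-D container of objects.

```python
import numpy as np

# Four synthetic "images" already resized to 150x150 with 3 channels
same_size = [np.zeros((150, 150, 3), dtype=np.uint8) for _ in range(4)]
batch = np.array(same_size)
print(batch.shape)  # (4, 150, 150, 3)

# Two images of different sizes cannot form a regular 4-D array;
# storing them explicitly as objects only yields a 1-D container.
mixed = [np.zeros((100, 120, 3), dtype=np.uint8),
         np.zeros((150, 150, 3), dtype=np.uint8)]
ragged = np.empty(len(mixed), dtype=object)
for i, img in enumerate(mixed):
    ragged[i] = img
print(ragged.shape)  # (2,)
```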

Take a look at 25 random samples

f,ax = plot.subplots(5,5) 
f.subplots_adjust(0,0,3,3)
for i in range(0,5,1):
    for j in range(0,5,1):
        rnd_number = randint(0,len(Images)-1)  # randint is inclusive on both ends
        ax[i,j].imshow(cv2.cvtColor(Images[rnd_number], cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
        ax[i,j].set_title(get_classlabel(Labels[rnd_number]))
        ax[i,j].axis('off')

II Model 1: CNN (accuracy 0.844)

The open-source notebook used as reference is attached.

2.1 CNN

model = Models.Sequential()
model.add(Layers.Conv2D(200,kernel_size=(3,3),activation='relu',input_shape=(150,150,3)))
model.add(Layers.Conv2D(180,kernel_size=(3,3),activation='relu'))
model.add(Layers.MaxPool2D(5,5))
model.add(Layers.Conv2D(180,kernel_size=(3,3),activation='relu'))
model.add(Layers.Conv2D(140,kernel_size=(3,3),activation='relu'))
model.add(Layers.Conv2D(100,kernel_size=(3,3),activation='relu'))
model.add(Layers.Conv2D(50,kernel_size=(3,3),activation='relu'))
model.add(Layers.MaxPool2D(5,5))
model.add(Layers.Flatten())
model.add(Layers.Dense(180,activation='relu'))
model.add(Layers.Dense(100,activation='relu'))
model.add(Layers.Dense(50,activation='relu'))
model.add(Layers.Dropout(rate=0.5))
model.add(Layers.Dense(6,activation='softmax'))

model.compile(optimizer=Optimizer.Adam(learning_rate=0.0001),loss='sparse_categorical_crossentropy',metrics=['accuracy'])  # lr= is the deprecated spelling

model.summary()

# Save the model diagram as an image
SVG(model_to_dot(model).create(prog='dot', format='svg'))
Utils.plot_model(model,to_file='model.png',show_shapes=True)

Let's take a look at the model first.
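The feature-map sizes reported by model.summary() can be checked by hand. The sketch below assumes the Keras defaults used in the model above (no padding, stride 1 for Conv2D, and MaxPool2D(5,5) meaning pool size 5 with stride 5) and traces the spatial size from the 150x150 input down to the Flatten layer. The helper conv_out is an illustrative name.

```python
def conv_out(size, kernel, stride=1):
    """Output length of a 'valid' (no padding) convolution or pooling along one axis."""
    return (size - kernel) // stride + 1

size = 150
trace = [size]
# (kernel, stride) for each spatial layer of the Sequential model, in order
layers = [(3, 1), (3, 1), (5, 5),                  # Conv, Conv, MaxPool
          (3, 1), (3, 1), (3, 1), (3, 1), (5, 5)]  # Conv x4, MaxPool
for kernel, stride in layers:
    size = conv_out(size, kernel, stride)
    trace.append(size)

flattened = size * size * 50  # the last Conv2D has 50 filters
print(trace)      # [150, 148, 146, 29, 27, 25, 23, 21, 4]
print(flattened)  # 800
```

So the first Dense layer receives an 800-dimensional vector. The two aggressive 5x5 pooling steps are what keep the fully connected part small.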

2.2 Training

Training takes a long time; consider lowering epochs from 35 to a smaller value.

trained = model.fit(Images,Labels,epochs=35,validation_split=0.30)
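With validation_split=0.30, Keras holds out the last 30% of the arrays passed to fit for validation and takes them before any per-epoch shuffling, which is why get_images shuffles the data up front. The sketch below illustrates that idea only and is not Keras's actual implementation. The function name split_like_keras and the sample count of 100 are hypothetical.

```python
import numpy as np

def split_like_keras(n_samples, validation_split):
    """Index ranges for the train/validation parts; validation is the tail of the data."""
    split_at = int(np.floor(n_samples * (1.0 - validation_split)))
    return np.arange(0, split_at), np.arange(split_at, n_samples)

# 100 is a hypothetical sample count used only for illustration
train_idx_demo, val_idx_demo = split_like_keras(100, 0.30)
print(len(train_idx_demo), len(val_idx_demo))
```

If the arrays were still ordered by class, the tail would contain only the last classes and the validation accuracy would be meaningless.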

Training curves

# With metrics=['accuracy'], TF2 Keras stores the history under 'accuracy' / 'val_accuracy'
plot.plot(trained.history['accuracy'])
plot.plot(trained.history['val_accuracy'])
plot.title('Model accuracy')
plot.ylabel('Accuracy')
plot.xlabel('Epoch')
plot.legend(['Train', 'Validation'], loc='upper left')
plot.show()

plot.plot(trained.history['loss'])
plot.plot(trained.history['val_loss'])
plot.title('Model loss')
plot.ylabel('Loss')
plot.xlabel('Epoch')
plot.legend(['Train', 'Validation'], loc='upper left')
plot.show()

2.3 Results

Performance on the test set

test_images,test_labels = get_images('../input/seg_test/seg_test/')
test_images = np.array(test_images)
test_labels = np.array(test_labels)
model.evaluate(test_images,test_labels, verbose=1)

This gives an accuracy of 0.8441140143076579.
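A single accuracy number does not show which classes get confused with each other. confusion_matrix is imported above as CM but never used, so here is a minimal NumPy sketch of what it computes. The labels below are hypothetical and are not results from the model above.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical labels for illustration only
y_true_demo = np.array([0, 0, 2, 2, 3, 3, 5])
y_pred_demo = np.array([0, 5, 2, 3, 3, 3, 5])

cm_demo = confusion_matrix_np(y_true_demo, y_pred_demo, n_classes=6)
accuracy_demo = np.trace(cm_demo) / cm_demo.sum()  # correct predictions sit on the diagonal
print(cm_demo)
print(accuracy_demo)
```

For the real model, the equivalent call would be CM(test_labels, np.argmax(model.predict(test_images), axis=1)). In this toy example the off-diagonal entries show one buildings image predicted as street and one glacier image predicted as mountain.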

Load the prediction set

pred_images,no_labels = get_images('../input/seg_pred/')
pred_images = np.array(pred_images)

Check the model's predictions

fig = plot.figure(figsize=(30, 30))
outer = gridspec.GridSpec(5, 5, wspace=0.2, hspace=0.2)

for i in range(25):
    inner = gridspec.GridSpecFromSubplotSpec(2, 1,subplot_spec=outer[i], wspace=0.1, hspace=0.1)
    rnd_number = randint(0,len(pred_images)-1)  # randint is inclusive on both ends
    pred_image = np.array([pred_images[rnd_number]])
    pred_prob = model.predict(pred_image).reshape(6)
    # predict_classes was removed from recent Keras versions; take the argmax of the probabilities instead
    pred_class = get_classlabel(int(np.argmax(pred_prob)))

    # Top cell: the image with its predicted class as title
    ax = plot.Subplot(fig, inner[0])
    ax.imshow(cv2.cvtColor(pred_image[0], cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
    ax.set_title(pred_class)
    ax.set_xticks([])
    ax.set_yticks([])
    fig.add_subplot(ax)

    # Bottom cell: the predicted probability of each of the 6 classes
    ax = plot.Subplot(fig, inner[1])
    ax.bar([0,1,2,3,4,5],pred_prob)
    fig.add_subplot(ax)

plot.show()

III Model 2: VGG (accuracy 0.908 to 0.918)

The open-source notebook used as reference is attached.
Load VGG16 pretrained on ImageNet

import tensorflow as tf  # used as tf.keras in the sections below
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input

model = VGG16(weights='imagenet', include_top=False)

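3.1 Feature extraction

# Assumption: train_images / train_labels are the training arrays loaded in section 1.2
train_images, train_labels = Images, Labels
train_features = model.predict(train_images)
test_features = model.predict(test_images)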
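The shape of the extracted features follows from the VGG16 architecture: with include_top=False the network ends after its fifth 2x2 max-pooling layer, each of which halves the spatial size (rounding down), and the last convolutional block has 512 channels. The sketch below does that arithmetic for the 150x150 inputs used here. The helper name is illustrative.

```python
def vgg16_feature_shape(height, width):
    """Spatial size after VGG16's five 2x2 max-pool layers, plus 512 output channels."""
    for _ in range(5):
        height, width = height // 2, width // 2
    return height, width, 512

fx, fy, fz = vgg16_feature_shape(150, 150)
print((fx, fy, fz))   # (4, 4, 512)
print(fx * fy * fz)   # 8192
```

Each image is therefore represented by 8192 numbers once flattened, which is what the PCA and the dense models below operate on.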

3.2 Visualizing the features with PCA

n_train, x, y, z = train_features.shape
n_test, x, y, z = test_features.shape
numFeatures = x * y * z

import matplotlib.pyplot as plt
from sklearn import decomposition

pca = decomposition.PCA(n_components = 2)

X = train_features.reshape((n_train, x*y*z))
pca.fit(X)

C = pca.transform(X) # Coordinates of the samples on the new axes
C1 = C[:,0]
C2 = C[:,1]

plt.subplots(figsize=(12,12))
class_names = [int2label[i] for i in range(6)]  # same order as label2int: buildings=0 ... street=5
for i, class_name in enumerate(class_names):
    plt.scatter(C1[train_labels == i][:1000], C2[train_labels == i][:1000], label = class_name, alpha=0.4)
plt.legend()
plt.title("PCA Projection")
plt.show()
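For readers who want to see what the PCA projection does under the hood, here is a minimal NumPy sketch based on the singular value decomposition of the centered data. It runs on synthetic data (an assumption for illustration, not the VGG features), and the signs of the axes may differ from what sklearn returns.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 16))   # 200 synthetic samples with 16 features
X_demo[:, 0] *= 10.0                  # give one direction much more variance

# PCA = center the data, then project onto the top right-singular vectors
X_centered = X_demo - X_demo.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:2]                          # the two principal axes
projected = X_centered @ components.T        # 2-D coordinates of every sample
explained_ratio = (S[:2] ** 2) / np.sum(S ** 2)

print(projected.shape)
print(explained_ratio)
```

The explained-variance ratio is worth checking on the real features too: if two components capture only a small share of the variance, overlapping clusters in the scatter plot do not necessarily mean the classes are inseparable in the full 8192-dimensional space.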

3.3 Define the model: DNN

model2 = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape = (x, y, z)),
    tf.keras.layers.Dense(180, activation=tf.nn.relu),
    tf.keras.layers.Dense(140, activation=tf.nn.relu),
    tf.keras.layers.Dense(120, activation=tf.nn.relu),
    Layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(50, activation=tf.nn.relu),
    Layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(6, activation=tf.nn.softmax)
])

model2.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])

history2 = model2.fit(train_features, train_labels, batch_size=128, epochs=15, validation_split = 0.2)

Model performance: accuracy 0.9083

test_loss, test_acc = model2.evaluate(test_features, test_labels)  # evaluate returns [loss, accuracy]

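3.4 Model stacking

Define the basic parameters. Each model below is trained on its own random sample of the training features and the predictions are summed, so the scheme is closer to bagging than to stacking in the strict sense.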

np.random.seed(seed=1997)
# Number of estimators
n_estimators = 10
# Proportion of samples used to train each model
max_samples = 0.8

max_samples *= n_train
max_samples = int(max_samples)

Build the list of models

models = list()
random = np.random.randint(50, 100, size = n_estimators)

for i in range(n_estimators):
    
    # Model
    model = tf.keras.Sequential([ tf.keras.layers.Flatten(input_shape = (x, y, z)),
                                # One layer with random size
                                    tf.keras.layers.Dense(random[i], activation=tf.nn.relu),
                                    tf.keras.layers.Dense(6, activation=tf.nn.softmax)
                                ])
    
    model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
    
    # Store model
    models.append(model)

Training

histories = []

for i in range(n_estimators):
    # Train each model on a bag of the training data
    train_idx = np.random.choice(len(train_features), size = max_samples)
    histories.append(models[i].fit(train_features[train_idx], train_labels[train_idx], batch_size=128, epochs=10, validation_split = 0.1))
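np.random.choice samples with replacement by default, so each bag in the loop above contains repeated samples and leaves others out. The sketch below shows the effect on a hypothetical pool of 1000 samples using NumPy's newer Generator API. The names and sizes are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1997)
n_demo = 1000                      # hypothetical number of training samples
bag_size = int(0.8 * n_demo)       # same proportion as max_samples above

bag_idx = rng.choice(n_demo, size=bag_size)  # replace=True by default
unique_fraction = len(np.unique(bag_idx)) / n_demo
print(len(bag_idx), round(unique_fraction, 2))
```

Because every model sees a different subset, their errors are less correlated, which is what makes combining them worthwhile.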

Model performance

predictions = []
for i in range(n_estimators):
    predictions.append(models[i].predict(test_features))
    
predictions = np.array(predictions)
predictions = predictions.sum(axis = 0)
pred_labels = predictions.argmax(axis=1)

from sklearn.metrics import accuracy_score
print("Accuracy : {}".format(accuracy_score(test_labels, pred_labels)))

Accuracy : 0.9186666666666666
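Summing the softmax outputs before taking the argmax, as done above, is often called soft voting. The sketch below contrasts it with hard (majority) voting on hypothetical probabilities for 3 models, 2 samples, and 3 classes. None of the numbers come from the models trained in this post.

```python
import numpy as np

# Hypothetical softmax outputs with shape (n_models, n_samples, n_classes)
probs = np.array([
    [[0.6, 0.3, 0.1], [0.40, 0.45, 0.15]],
    [[0.5, 0.4, 0.1], [0.10, 0.80, 0.10]],
    [[0.2, 0.7, 0.1], [0.45, 0.40, 0.15]],
])

# Soft voting: add up the probabilities, then pick the best class per sample
soft_vote = probs.sum(axis=0).argmax(axis=1)

# Hard voting: each model casts one vote per sample, the majority wins
hard_votes = probs.argmax(axis=2)   # shape (n_models, n_samples)
hard_vote = np.array([np.bincount(col, minlength=3).argmax() for col in hard_votes.T])

print(soft_vote)  # [1 1]
print(hard_vote)  # [0 1]
```

For the first sample the two rules disagree: two models weakly prefer class 0, but the third is confident enough in class 1 to tip the summed probabilities. Soft voting keeps that confidence information, which hard voting discards.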
