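This post applies plain K-means clustering to the ionosphere radar return data set, which has a fairly large number of features and samples, and then compares the clusters with the true labels to compute an accuracy figure and check how well the method works.
PCA is used to reduce the data to two dimensions for visualization.
The code was written in MATLAB R2020a.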
%% Clear the workspace and Command Window, and close all figure windows
clc
clear
close all
%% Ionosphere radar return data set from the UCI Machine Learning Repository
% http://archive.ics.uci.edu/ml/datasets/Ionosphere
% X is a 351x34 real-valued matrix of predictors. Y is a categorical response:
% "b" for bad radar returns and "g" for good radar returns.
% This is a binary classification problem.
%% Read the data set
ionosphere = readtable('ionosphere.data','Filetype','text','ReadVariableNames',false);
data = table2array(ionosphere(:,1:34));
% Convert the labels from strings to numbers: "g" -> 1, "b" -> 2
label_str = table2array(ionosphere(:,end));
% strcmp compares the whole cell array at once and returns a logical vector
label = 2 - double(strcmp(label_str, 'g'));
%% K-means clustering
disp("K-means iteration log:")
[idx,C] = kmeans(data,2,'Display','iter','Distance','sqeuclidean','Start','plus');
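%% Optional: rerun K-means from several starting points (sketch)
% K-means can stop in a local minimum that depends on the initial centroids.
% A minimal sketch, assuming the kmeans 'Replicates' option: it repeats the
% clustering and keeps the run with the lowest total within-cluster distance.
% The value 10, the fixed seed, and the names idx_rep, C_rep, sumd_rep are
% illustrative assumptions and are not part of the original script.
rng(1); % fix the random seed so this sketch is reproducible
[idx_rep, C_rep, sumd_rep] = kmeans(data, 2, 'Distance', 'sqeuclidean', ...
    'Start', 'plus', 'Replicates', 10);
fprintf('Total within-cluster sum of squared distances: %.4f\n', sum(sumd_rep));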
%% Silhouette plot
figure;
silhouette(data, idx,'sqEuclidean');
title('Silhouette plot')
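%% Optional: summarize the silhouette values as one number (sketch)
% The plot above can be complemented by the mean silhouette value. Values near 1
% suggest well-separated clusters; values near 0 or below suggest overlap.
% A minimal sketch; the names sil_vals and sil_mean are illustrative.
sil_vals = silhouette(data, idx, 'sqEuclidean'); % one value per sample, in [-1, 1]
sil_mean = mean(sil_vals);
fprintf('Mean silhouette value: %.4f\n', sil_mean);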
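%% Dimensionality reduction with principal component analysis (PCA)
% Rows of data are observations, so pca is called on data itself (not data').
% The scores are each sample's coordinates on the first two principal components.
[~, score] = pca(data, 'NumComponents', 2); % reduce to 2-D for visualization
%% Plots
figure;
gscatter(score(:,1), score(:,2), idx);
title('K-means clustering result')
figure;
gscatter(score(:,1), score(:,2), label);
title('True classes')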
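%% Optional: how much variance a 2-D PCA view retains (sketch)
% A 2-D scatter plot can hide structure when the first two components carry
% only a small share of the variance. A minimal sketch, assuming a PCA of data
% with samples as rows: the fifth output of pca is the percentage of total
% variance explained by each component. The names explained_pct and kept_pct
% are illustrative.
[~, ~, ~, ~, explained_pct] = pca(data);
kept_pct = sum(explained_pct(1:2)); % share of variance kept by two components
fprintf('First two principal components explain %.2f%% of the variance\n', kept_pct);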
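%% Compute the accuracy
% K-means numbers its clusters arbitrarily, so cluster 1 need not correspond to
% label 1 ("g"). With two clusters there are only two possible mappings, so
% take the better of the two.
agree = mean(idx == label);  % fraction of samples whose cluster index equals the label
acc = max(agree, 1 - agree); % accuracy under the better cluster-to-label mapping
disp("K-means clustering accuracy:")
disp(acc)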
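%% Optional: confusion matrix of true labels against clusters (sketch)
% Cross-tabulating the labels against the cluster indices shows which cluster
% captures which class and makes the arbitrary cluster numbering visible.
% A minimal sketch; the names cm, acc_direct, acc_swapped are illustrative.
cm = confusionmat(label, idx, 'Order', [1 2]); % rows: true label, columns: cluster index
disp(cm)
acc_direct  = trace(cm) / sum(cm(:));           % cluster 1 -> "g", cluster 2 -> "b"
acc_swapped = (cm(1,2) + cm(2,1)) / sum(cm(:)); % cluster 1 -> "b", cluster 2 -> "g"
fprintf('Accuracy (direct / swapped mapping): %.4f / %.4f\n', acc_direct, acc_swapped);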