1. Training and test statistics for waybill samples missed by localization
- Table of manually annotated optimal segmentation thresholds for waybills
- Linear regression model
Notes on Table 1-2:
1) Training used 100 manually annotated samples; each sample has 4 features (in order: the gray-level mean and global Otsu threshold of the original grayscale image, and the gray-level mean and global Otsu threshold of the grayscale image after highlight repair);
2) The test data are 5 groups of samples randomly drawn from the training set, in order to observe how well the linear regression model fits the training data;
(no samples from outside the training set were used to verify generalization)
3) The table shows that the different fitting methods give very similar results, but all deviate noticeably from the labels (the ground-truth values), which indicates the model underfits and needs improvement;
Possible remedies: a. improve the way the regression model fits, e.g. fit a higher-degree polynomial (see the analysis of Table 1-3);
b. enlarge the training set (see the analysis of Table 1-4);
c. improve the training features;
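For reference, the 4-feature vector described in note 1) above (gray-level mean plus global Otsu threshold, before and after highlight repair) can be sketched in pure NumPy as follows. This is only an illustrative sketch: the document's own feature-extraction code is not shown, the function names are my own, and the highlight-repaired image is taken here as a given input.

```python
import numpy as np

def otsu_threshold(gray):
    """Global Otsu threshold of an 8-bit grayscale image (pure NumPy)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    omega = np.cumsum(hist) / total                # class-0 probability per threshold
    mu = np.cumsum(hist * np.arange(256)) / total  # cumulative mean up to each level
    mu_t = mu[-1]                                  # global mean
    # Between-class variance for every candidate threshold; guard the 0/0 cases.
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

def waybill_features(gray_src, gray_repaired):
    """The 4 features from note 1): mean + Otsu threshold of the raw image,
    then mean + Otsu threshold of the highlight-repaired image."""
    return [float(gray_src.mean()), otsu_threshold(gray_src),
            float(gray_repaired.mean()), otsu_threshold(gray_repaired)]
```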
Notes on Table 1-3:
1) The training set is still the same 100 samples;
2) When the test samples are drawn from the training set (N = 5), the regression model's predictions agree closely with the labels; but when samples from a held-out test set are used, the predictions deviate badly from the labels, which indicates the model has overfit;
3) The training data and test code are listed below.
Train_data.csv
```
49.986721,50,48.862217,49,34
23.480339,63,22.087894,23,45
47.97068,121,38.859943,44,74
61.894985,117,48.692921,56,90
52.253571,61,49.11853,55,73
19.852621,74,19.222271,72,72
20.845728,62,19.947214,57,56
44.538506,115,37.98954,43,43
13.515056,31,12.830379,28,55
23.994682,69,21.488462,59,85
35.872845,79,32.430157,33,44
29.536407,72,28.510805,64,90
45.398739,128,39.323338,40,35
29.679146,34,27.898521,31,62
84.080841,87,77.693764,74,73
46.169239,134,42.523769,47,38
39.475365,93,34.886265,49,49
59.247337,60,57.82312,58,16
30.091749,92,26.634928,67,68
21.239315,64,20.726948,61,61
32.904915,124,30.729103,34,15
70.833916,64,68.500946,61,54
42.092319,97,38.191372,71,71
16.85672,21,16.744583,21,35
39.519398,99,31.689426,43,52
19.60574,30,17.920612,25,46
30.031826,92,26.656433,67,67
26.915159,33,24.503157,28,20
21.294516,82,19.798532,74,74
39.678482,121,34.299995,47,38
69.679398,68,66.585281,64,35
33.415878,99,27.269951,29,42
37.286701,121,27.919937,34,68
22.043592,51,21.266073,48,48
48.50737,133,37.857922,44,46
34.138905,75,34.10067,74,74
37.146286,56,32.108444,43,50
30.016088,51,28.376884,34,34
36.044006,80,30.333361,50,38
71.459511,122,55.584164,58,45
22.01244,84,20.370054,74,74
53.094749,135,40.457184,46,56
36.035366,110,30.367359,39,45
35.164429,74,35.164429,74,74
31.496492,34,30.231068,32,55
34.097385,36,32.628937,33,50
26.203951,28,25.222651,26,16
52.870003,57,48.843529,52,110
71.513496,123,55.356491,58,48
34.929268,107,29.545202,39,50
20.235893,62,19.217731,27,60
23.946028,58,23.30785,35,40
19.356678,58,18.866066,54,54
35.159679,41,35.159081,41,69
38.879154,131,31.500521,37,37
34.023487,82,32.900944,77,77
30.402153,71,28.825174,66,66
36.666336,100,33.766953,37,81
31.849125,96,27.252998,35,57
14.701057,66,13.695985,59,58
24.746655,37,22.891354,30,35
15.988938,57,15.686599,55,56
64.210938,124,49.422485,53,58
38.794212,113,33.314041,39,74
13.345396,57,13.345396,57,56
41.711437,124,31.613926,37,110
38.293716,79,37.682972,76,76
28.073799,75,24.945026,34,46
24.975405,38,22.897106,29,40
36.917492,116,29.782568,39,56
56.393402,69,56.080254,68,44
49.489582,71,46.603695,64,26
11.233971,51,10.257475,26,45
75.045959,82,69.606705,72,72
57.210033,65,55.995747,62,46
47.267517,117,36.350822,44,95
13.978129,62,13.61613,59,58
19.296797,54,19.054842,53,53
13.23297,60,13.115759,59,58
37.072403,116,30.066593,40,55
22.710648,37,21.312677,34,34
77.6241,76,73.901886,71,35
68.972183,80,59.639511,65,115
14.73587,64,14.347698,62,62
27.683052,82,26.073189,73,14
8.845615,23,8.736045,22,40
50.006321,120,38.885624,48,45
61.246403,67,59.450649,63,72
27.532454,88,24.988146,74,74
46.767673,115,36.288193,46,105
59.823963,66,58.650486,64,25
38.470818,117,31.213203,41,60
53.724159,108,46.329811,48,55
13.401003,61,13.132825,60,58
46.456448,126,38.748985,43,43
82.047318,78,78.135925,73,50
34.080803,94,28.746769,40,65
26.933811,79,25.688793,75,75
30.923611,66,29.268738,38,60
34.891171,36,33.256649,34,40
```
MLP_PolyNormialFeatures.py
```python
import numpy as np
from numpy import genfromtxt
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures

# Path to the training CSV (same layout as Train_data.csv listed above).
dataPath = r"delivery_analyze.csv"
deliveryData = genfromtxt(dataPath, delimiter=',')

print("data")
print(deliveryData)

X = deliveryData[:, :-1]  # the 4 feature columns
Y = deliveryData[:, -1]   # manually annotated threshold (label)

poly_reg = PolynomialFeatures(degree=2)  # degree=2: quadratic polynomial features
X_poly = poly_reg.fit_transform(X)
lin_reg_2 = linear_model.LinearRegression()
lin_reg_2.fit(X_poly, Y)

# # Inspect the regression coefficients
# print('Coefficients:', lin_reg_2.coef_)
# # Inspect the intercept
# print('intercept:', lin_reg_2.intercept_)

test_datas = [[82.047318, 78, 78.135925, 73],
              [34.080803, 94, 28.746769, 40],
              [26.933811, 79, 25.688793, 75],
              [30.923611, 66, 29.268738, 38],
              [34.891171, 36, 33.256649, 34],
              [51.101326, 80, 44.875404, 70],
              [35.011463, 91, 32.894814, 78],
              [68.176659, 69, 67.649818, 68],
              [26.362432, 79, 26.302559, 78],
              [25.918451, 85, 24.443407, 74]]

for data in test_datas:
    X_pred = np.array(data).reshape(1, -1)
    # Use transform (not fit_transform): the feature expansion was already
    # fitted on the training data above.
    Y_pred = lin_reg_2.predict(poly_reg.transform(X_pred))
    print(Y_pred)
```
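Note 2) of Table 1-3 observes that testing only on training samples hides overfitting. A minimal held-out evaluation could look like the sketch below; it assumes the same CSV layout as Train_data.csv (4 feature columns followed by the label column), and the function name `evaluate` and the 80/20 split are my own choices, not from the original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def evaluate(data, degree=2, test_size=0.2, seed=0):
    """Fit a polynomial regression on a training split and return the mean
    absolute error on the train and test splits; a large gap between the
    two indicates overfitting."""
    X, y = data[:, :-1], data[:, -1]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=seed)
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_tr, y_tr)
    mae_train = float(np.mean(np.abs(model.predict(X_tr) - y_tr)))
    mae_test = float(np.mean(np.abs(model.predict(X_te) - y_te)))
    return mae_train, mae_test

# Usage with the CSV shown above:
# data = np.genfromtxt("Train_data.csv", delimiter=',')
# print(evaluate(data, degree=2))
```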
Notes on Table 1-4:
1)

| Sample # | Src_gray Mean | Src_gray Otsu | Manual best threshold (src) | Src_gray_inpaint Mean | Src_gray_inpaint Otsu | Manual best threshold (inpainted) |
|---|---|---|---|---|---|---|
| 7 | 44 | 115 | 43 | 37.98 | 43 | 43 |
| 10 | 35.87 | 79 | 44 | 32.43 | 33 | 44 |
| 16 | 39.47 | 93 | 93 | 34.88 | 49 | 49 |
| 20 | 32.90 | 123 | 14 | 30.72 | 34 | 15 |
| 22 | 42.09 | 97 | 70 | 38.19 | 71 | 71 |
| 27 | 26.91 | 33 | 20 | 24.50 | 28 | 20 |
| 29 | 39.67 | 121 | 38 | 34.29 | 47 | 38 |
| 30 | 69.67 | 68 | 32 | 66.58 | 64 | 35 |
| 31 | 33.41 | 99 | 41 | 27.26 | 29 | 42 |
| 32 | 37.28 | 121 | 68 | 27.91 | 34 | 68 |
| 34 | 48.50 | 133 | 46 | 37.85 | 44 | 46 |
| 36 | 37.14 | 56 | 56 | 32.10 | 42 | 50 |
| 37 | 30.01 | 50 | 35 | 28.37 | 34 | 34 |
| 38 | 36.04 | 80 | 37 | 30.33 | 50 | 38 |
| 39 | 71.45 | 122 | 44 | 55.58 | 58 | 45 |
| 41 | 43.09 | 135 | 55 | 40.57 | 45 | 56 |
| 42 | 36.03 | 110 | 56 | 30.36 | 39 | 45 |
| 46 | 26.20 | 28 | 16 | 25.22 | 26 | 16 |
| 48 | 71.51 | 123 | 47 | 55.35 | 58 | 48 |
| 49 | 34.92 | 107 | 43 | 29.54 | 39 | 50 |
| 50 | 20.23 | 62 | 62 | 19.21 | 27 | 60 |
| 51 | 23.94 | 58 | 40 | 23.30 | 35 | 40 |
| 53 | 35.16 | 41 | 68 | 35.15 | 41 | 69 |
| 55 | 34.02 | 82 | 82 | 32.90 | 77 | 77 |
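As a quick sanity check on the table rows above, comparing the global Otsu column against the manually annotated best threshold (here for the first four samples, 7, 10, 16 and 20) shows how far plain Otsu is from the desired threshold on the raw images, and how much the highlight repair narrows that gap:

```python
# (src_otsu, src_label, inpaint_otsu, inpaint_label) for samples 7, 10, 16, 20,
# copied from the first four rows of Table 1-4.
rows = [(115, 43, 43, 43),
        (79, 44, 33, 44),
        (93, 93, 49, 49),
        (123, 14, 34, 15)]

# Mean absolute deviation of Otsu from the manual label, before/after repair.
mae_src = sum(abs(o - l) for o, l, _, _ in rows) / len(rows)
mae_inp = sum(abs(o - l) for _, _, o, l in rows) / len(rows)
print(mae_src, mae_inp)  # 54.0 7.5
```

On these four samples the repaired image's Otsu threshold tracks the manual label far more closely, which is consistent with feeding both pairs of statistics to the regression model as features.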