R_Studio(神经网络)BP神经网络算法预测销量的高低

  BP神经网络  百度百科:传送门

  BP(back propagation)神经网络:一种按照误差逆向传播算法训练的多层前馈神经网络,是目前应用最广泛的神经网络

  R_Studio(神经网络)BP神经网络算法预测销量的高低

#设置文件工作区间
setwd('D:\\dat')
#读入数据
Gary=read.csv("sales_data.csv")[,2:5]
#数据命名
library(nnet)
colnames(Gary)<-c("x1","x2","x3","y") ###最终模型
model1=nnet(y~.,data=Gary,size=6,decay=5e-4,maxit=1000) pred=predict(model1,Gary[,1:3],type="class")
(P=sum(as.numeric(pred==Gary$y))/nrow(Gary))
table(Gary$y,pred)
prop.table(table(Gary$y,pred),1)

Gary.Script

实现过程

  目的:通过BP神经网络预测销量的高低

  数据预处理,对数据进行重命名并去除无关项

> #设置文件工作区间
> setwd('D:\\dat')
> #读入数据
> Gary=read.csv("sales_data.csv")[,2:5]
> #数据命名
> library(nnet)
> colnames(Gary)<-c("x1","x2","x3","y")
> Gary
x1 x2 x3 y
1 bad yes yes high
2 bad yes yes high
3 bad yes yes high
4 bad no yes high
5 bad yes yes high
6 bad no yes high
7 bad yes no high
8 good yes yes high
9 good yes no high
10 good yes yes high
11 good yes yes high
12 good yes yes high
13 good yes yes high
14 bad yes yes low
15 good no yes high
16 good no yes high
17 good no yes high
18 good no yes high
19 good no no high
20 bad no no low
21 bad no yes low
22 bad no yes low
23 bad no yes low
24 bad no no low
25 bad yes no low
26 good no yes low
27 good no yes low
28 bad no no low
29 bad no no low
30 good no no low
31 bad yes no low
32 good no yes low
33 good no no low
34 good no no low

  nnet:包实现了前馈神经网络和多项对数线性模型。前馈神经网络是一种常用的神经网络结构,如下图所示

  R_Studio(神经网络)BP神经网络算法预测销量的高低

  前馈网络中各个神经元按接受信息的先后分为不同的组。每一组可以看作一个神经层。每一层中的神经元接受前一层神经元的输出,并输出到下一层神经元。整个网络中的信息是朝一个方向传播,没有反向的信息传播。前馈网络可以用一个有向无环路图表示。前馈网络可以看作一个函数,通过简单非线性函数的多次复合,实现输入空间到输出空间的复杂映射。这种网络结构简单,易于实现。前馈网络包括全连接前馈网络和卷积神经网络等

使用neet()方法创建模型

  neet()方法:

  decay: 权重衰减parameter for weight decay. Default 0.

  maxit: 最大迭代次数

x: 训练样本数据集的输入集合
y: x对应的训练样本数据集的标签(类)集合
weights:
size: 隐层节点数, Can be zero if there are skip-layer units.
data:训练数据集.
subset: An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
na.action: A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)
contrasts:a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.
Wts: 边的权重. If missing chosen at random.
mask: logical vector indicating which parameters should be optimized (default all).
linout: 是否为逻辑输出单元,若为FALSE,则为线性输出单元
entropy: switch for entropy (= maximum conditional likelihood) fitting. Default by least-squares.
softmax: switch for softmax (log-linear model) and maximum conditional likelihood fitting. linout, entropy, softmax and censored are mutually exclusive.
censored: A variant on softmax, in which non-zero targets mean possible classes. Thus for softmax a row of (0, 1, 1) means one example each of classes 2 and 3, but for censored it means one example whose class is only known to be 2 or 3.
skip: switch to add skip-layer connections from input to output.
rang: Initial random weights on [-rang, rang]. Value about 0.5 unless the inputs are large, in which case it should be chosen so that rang * max(|x|) is about 1.
decay: 权重衰减parameter for weight decay. Default 0.
maxit: 最大迭代次数
Hess: If true, the Hessian of the measure of fit at the best set of weights found is returned as component Hessian.
trace: switch for tracing optimization. Default TRUE.
MaxNWts: The maximum allowable number of weights. There is no intrinsic limit in the code, but increasing MaxNWts will probably allow fits that are very slow and time-consuming.
abstol: Stop if the fit criterion falls below abstol, indicating an essentially perfect fit.
reltol: Stop if the optimizer is unable to reduce the fit criterion by a factor of at least 1 - reltol.

neet()方法

model1=nnet(y~.,data=Gary,size=6,decay=5e-4,maxit=1000)
# weights: 31
initial value 27.073547
iter 10 value 16.080731
iter 20 value 15.038060
iter 30 value 14.937127
iter 40 value 14.917485
iter 50 value 14.911531
iter 60 value 14.908678
iter 70 value 14.907836
iter 80 value 14.905234
iter 90 value 14.904499
iter 100 value 14.904028
iter 110 value 14.903688
iter 120 value 14.903480
iter 130 value 14.903450
iter 130 value 14.903450
iter 130 value 14.903450
final value 14.903450
converged

评估模型

> pred=predict(model1,Gary[,1:3],type="class")
> (P=sum(as.numeric(pred==Gary$y))/nrow(Gary))
[1] 0.7647059
> table(Gary$y,pred)
pred
high low
high 14 4
low 4 12
> prop.table(table(Gary$y,pred),1)
pred
high low
high 0.7777778 0.2222222
low 0.2500000 0.7500000

  得到混淆矩阵图后可以看出

  检测样本为34个,预测正确的个数为26个,预测准确率为76.5%,预测准确率较低

  原因:由于神经网络训练时需要较多的样本,而这里训练集比较少 

上一篇:Java虚拟机垃圾回收:内存分配与回收策略 方法区垃圾回收 以及 JVM垃圾回收的调优方法


下一篇:秒懂神经网络---BP神经网络具体应用不能说的秘密.