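1. Install Anaconda
I installed it a long time ago and no longer remember the steps, so this part is skipped.
2. Install pystan
Open Anaconda Prompt
(base) C://... >conda install pystan
Probably because Visual Studio C++ was already installed, the pystan installation went smoothly. I only had to enter Y at the [Y]/N prompt, and after several "done" messages in a row the installation finished.
3. Install fbprophet
(base) C://... >conda install -c conda-forge fbprophet
This raised an error saying plotly was not installed, so install it!
(base) C://... >pip install plotly_express
4. Check that the installation succeeded
(base) C://... >python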
>>> import pystan
>>> import fbprophet
No errors, so the installation succeeded.
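The same check can be scripted. Below is a minimal sketch that uses only the standard library to test whether each package can be found, without actually importing it. The helper name check_installed is my own and is not part of any of these packages.

```python
import importlib.util

def check_installed(names):
    """Return {name: True/False} depending on whether each top-level package is found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Package names taken from the installation steps above.
status = check_installed(["pystan", "fbprophet", "plotly"])
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

A package reported as found can still fail at import time (for example, because of a broken compiler toolchain), so running the import statements remains the real test.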
5. Load the data
>>> import pandas as pd
>>> from fbprophet import Prophet
>>> df = pd.read_csv(r'D:\Machine learning\prophet-main\example_wp_log_peyton_manning.csv')
The dataset comes from the examples folder on Prophet's GitHub page.
>>> df.head()
           ds         y
0  2007-12-10  9.590761
1  2007-12-11  8.519590
2  2007-12-12  8.183677
3  2007-12-13  8.072467
4  2007-12-14  7.893572
>>> type(df.ds.iloc[0])
<class 'str'>
Inspect the data and the data type of the ds column.
>>> m = Prophet()
>>> m.fit(df)
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
Initial log joint probability = -19.4685
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
99 7974.84 0.00139678 451.299 1 1 127
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
199 7993.33 0.00250392 122.806 1 1 249
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
257 7996.11 6.49542e-005 196.815 3.664e-007 0.001 352 LS failed, Hessian reset
299 7997.14 0.000274336 138.333 0.4632 0.4632 400
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
353 7998.5 8.92333e-005 165.006 1.585e-007 0.001 508 LS failed, Hessian reset
399 7999.92 0.000379877 87.9033 1 1 568
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
499 8001.45 0.000925247 81.3781 1 1 690
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
501 8001.47 8.16398e-005 187.283 7.257e-007 0.001 737 LS failed, Hessian reset
599 8003.19 0.000309044 85.014 0.2 1 865
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
629 8003.45 0.000142042 190.253 1.703e-006 0.001 943 LS failed, Hessian reset
699 8003.94 3.56514e-005 75.4715 1 1 1030
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
730 8003.97 5.09062e-005 160.324 5.924e-007 0.001 1115 LS failed, Hessian reset
772 8003.99 4.1733e-007 59.2228 1 1 1167
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
<fbprophet.forecaster.Prophet object at 0x0000022560BCE820>
Train the model.
6. Forecast with the predict method
>>> future = m.make_future_dataframe(periods=365)
>>> future.tail()
             ds
3265 2017-01-15
3266 2017-01-16
3267 2017-01-17
3268 2017-01-18
3269 2017-01-19
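Build the frame to predict on, future: a new dataframe that extends the history by the requested number of days.
The ds column holds the dates to forecast, and the model assigns each row a predicted value yhat. The forecast contains the prediction column yhat as well as the columns yhat_lower and yhat_upper, which describe the uncertainty interval.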
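To make the date extension concrete, here is a rough standard-library sketch of the idea for daily data. It is not Prophet's implementation. The function name make_future_dates is hypothetical, and the last historical date (2016-01-20) is an assumption inferred from the tail shown above, which ends 365 days later.

```python
from datetime import date, timedelta

def make_future_dates(history, periods, include_history=True):
    """Append `periods` daily dates after the last date in `history`."""
    last = max(history)
    extra = [last + timedelta(days=i) for i in range(1, periods + 1)]
    return (sorted(history) + extra) if include_history else extra

# Assumed: the history ends on 2016-01-20 (inferred, not read from the file).
history = [date(2016, 1, 18), date(2016, 1, 19), date(2016, 1, 20)]
future = make_future_dates(history, periods=365)
print(future[-1])  # 2017-01-19
```

Because the historical dates are kept in the frame, the model produces a fitted value for the past as well as a forecast for the new dates.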
>>> forecast = m.predict(future)
>>> forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
             ds      yhat  yhat_lower  yhat_upper
3265 2017-01-15  8.213978    7.472093    8.983162
3266 2017-01-16  8.539051    7.812894    9.242272
3267 2017-01-17  8.326490    7.602116    9.047672
3268 2017-01-18  8.159151    7.435610    8.911742
3269 2017-01-19  8.171109    7.451504    8.889298
>>> fig1 = m.plot(forecast)
>>> fig1.show()
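fig1 - Forecast result
To look at the individual components of the forecast, use the Prophet.plot_components method. It shows the trend, yearly seasonality, and weekly seasonality of the time series. If holidays are included, they appear here as well.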
>>> fig2 = m.plot_components(forecast)
>>> fig2.show()
fig2 - Trend, yearly seasonality, and weekly seasonality of the time series
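The idea behind the component plot can be sketched as follows, assuming Prophet's default additive mode, in which the forecast is the sum of its components. The functions and coefficients below are made up for illustration and are not values fitted to this dataset.

```python
import math

# Toy components; all coefficients are hypothetical, not fitted values.
def trend(t):
    return 8.0 + 0.001 * t  # slow linear drift, t in days

def weekly(t):
    return 0.3 * math.sin(2 * math.pi * t / 7)  # repeats every 7 days

def yearly(t):
    return 0.5 * math.sin(2 * math.pi * t / 365.25)  # repeats every year

def yhat(t):
    # Additive model: the forecast is the sum of the components.
    return trend(t) + weekly(t) + yearly(t)

for t in (0, 3, 180):
    print(t, round(trend(t), 3), round(weekly(t), 3), round(yearly(t), 3), round(yhat(t), 3))
```

Each panel of the component plot corresponds to one such term plotted on its own.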
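7. Forecasting growth
When forecasting growth, the model is trained with a logistic growth model. Growth in a time series usually saturates at some level, called the carrying capacity. It is supplied in the cap column and is set to 8.5 here.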
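As a rough illustration of what saturation at the carrying capacity means, here is the basic logistic curve in plain Python. This is a simplified sketch, not Prophet's trend code (which also handles changepoints). The growth rate k and offset m are hypothetical values chosen only for the demonstration.

```python
import math

def logistic_trend(t, cap, k, m):
    """Basic logistic curve: cap / (1 + exp(-k * (t - m)))."""
    return cap / (1.0 + math.exp(-k * (t - m)))

# cap=8.5 matches the text; k and m are made-up illustration values.
for t in (-10, 0, 10):
    print(t, round(logistic_trend(t, cap=8.5, k=1.0, m=0.0), 4))
```

At t = m the curve sits at half the capacity, and as t grows it approaches cap without exceeding it.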
>>> df['cap'] = 8.5
>>> m = Prophet(growth='logistic')
>>> m.fit(df)
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
Initial log joint probability = -8.47531
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
99 7621.52 0.00362023 133.187 0.5014 0.5014 117
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
199 7635.43 0.000861862 95.3977 1 1 229
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
213 7636.7 0.000229714 65.7989 1.245e-006 0.001 289 LS failed, Hessian reset
239 7638.34 0.000156188 59.037 1.483e-006 0.001 356 LS failed, Hessian reset
299 7645.01 0.00374056 103.247 1 1 428
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
310 7645.75 0.000203117 77.8656 1.471e-006 0.001 489 LS failed, Hessian reset
337 7648.86 0.00035191 102.822 1.352e-006 0.001 561 LS failed, Hessian reset
399 7652.37 0.0008327 83.6678 0.66 1 649
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
488 7658.97 0.000244669 82.2666 1.374e-006 0.001 805 LS failed, Hessian reset
499 7659.51 2.19799e-005 69.3234 0.002165 0.4166 821
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
557 7661.28 0.000211215 71.7745 1.484e-006 0.001 942 LS failed, Hessian reset
599 7662.73 0.000471138 81.0094 0.8559 0.8559 993
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
651 7663.71 0.000259132 89.8795 2.762e-006 0.001 1094 LS failed, Hessian reset
696 7663.76 2.54313e-006 67.307 4.475e-008 0.001 1197 LS failed, Hessian reset
699 7663.76 2.30729e-006 61.8599 1 1 1200
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
702 7663.76 1.96329e-007 60.4436 0.2355 1 1206
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
<fbprophet.forecaster.Prophet object at 0x0000022560CA5D90>
Forecast the next five years (1826 days).
>>> future = m.make_future_dataframe(periods=1826)
>>> future['cap'] = 8.5
>>> fcst = m.predict(future)
>>> fig3 = m.plot(fcst)
>>> fig3.show()
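fig3 - Growth forecast with the upper bound cap
In the figure, 2016-2020 is the forecast portion and the black dashed line is the carrying capacity. By default the logistic function has a saturating minimum of 0. Going further, the lower bound (saturating minimum) can be controlled through a column analogous to 'cap', named 'floor' here.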
>>> df['y'] = 10 - df['y']
>>> df['cap'] = 6
>>> df['floor'] = 1.5
>>> future['cap'] = 6
>>> future['floor'] = 1.5
>>> m = Prophet(growth='logistic')
>>> m.fit(df)
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
Initial log joint probability = -242.439
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
99 4261.56 0.0192641 329.803 0.2518 1 138
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
199 4267.56 0.000172836 520.788 0.06536 0.06536 278
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
268 4272.79 5.30085e-005 181.799 8.235e-008 0.001 429 LS failed, Hessian reset
299 4275.53 0.00830706 145.187 0.7065 1 466
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
397 4276.89 1.13286e-005 74.3136 1.018e-007 0.001 657 LS failed, Hessian reset
399 4276.9 0.000635838 55.8097 1 1 660
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
439 4277.03 1.64021e-005 83.8871 8.392e-008 0.001 775 LS failed, Hessian reset
459 4277.07 1.11134e-005 80.4945 1.438e-007 0.001 840 LS failed, Hessian reset
497 4277.99 3.53233e-005 174.249 8.222e-008 0.001 937 LS failed, Hessian reset
499 4278.13 0.000474647 84.8754 1 1 941
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
508 4278.37 1.81296e-005 62.6272 7.532e-008 0.001 998 LS failed, Hessian reset
599 4279.08 0.000361055 70.4104 1 1 1136
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
605 4279.09 0.000137192 80.5238 2.417e-006 0.001 1203 LS failed, Hessian reset
632 4279.12 1.12489e-005 77.1426 1.116e-007 0.001 1290 LS failed, Hessian reset
681 4279.14 2.4528e-007 44.2025 0.3853 0.3853 1362
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
<fbprophet.forecaster.Prophet object at 0x0000022564FD0250>
>>> fcst = m.predict(future)
>>> fig4 = m.plot(fcst)
>>> fig4.show()
fig4 - Forecast with the upper bound 'cap' and the lower bound 'floor'
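The effect of combining cap and floor can be sketched by shifting a logistic curve so that it runs between the two bounds. This is a simplified illustration under my own assumptions, not Prophet's implementation. The name saturating_trend and the values of k and m are hypothetical.

```python
import math

def saturating_trend(t, cap, floor, k, m):
    """Logistic curve rescaled to lie between floor and cap."""
    return floor + (cap - floor) / (1.0 + math.exp(-k * (t - m)))

# cap=6 and floor=1.5 match the example above; a negative k gives a declining curve.
for t in (-20, 0, 20):
    print(t, round(saturating_trend(t, cap=6, floor=1.5, k=-1.0, m=0.0), 4))
```

With a negative growth rate the curve starts near cap and flattens out at floor instead of at 0.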
Syntax summary
# Initial setup
>>> import pandas as pd
>>> from fbprophet import Prophet # import Prophet
>>> df = pd.read_csv(r'D:\Machine learning\prophet-main\example_wp_log_peyton_manning.csv') # read the dataset
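# Growth trend forecast (linear, the default)
>>> m = Prophet()
>>> m.fit(df) # train the model
>>> future = m.make_future_dataframe(periods=365) # forecast one year ahead
>>> future.tail()
>>> forecast = m.predict(future)
>>> forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail() # prediction and its uncertainty interval
>>> fig1 = m.plot(forecast) # plot the forecast
>>> fig1.show()
>>> fig2 = m.plot_components(forecast) # plot the components of the forecast (trend, yearly and weekly seasonality, plus holidays if included)
>>> fig2.show()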
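# Growth trend forecast (logistic)
# With growth='logistic' the upper bound cap must be set; the saturating minimum defaults to 0 but can also be specified
>>> df['cap'] = 8.5
>>> m = Prophet(growth='logistic')
>>> m.fit(df) # train the model on the training data
>>> future = m.make_future_dataframe(periods=1826) # build the frame to predict on and set the forecast horizon
>>> future['cap'] = 8.5 # upper bound over the forecast horizon
>>> fcst = m.predict(future)
>>> fig3 = m.plot(fcst)
>>> fig3.show()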
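# Growth trend forecast (logistic) with upper bound cap and lower bound floor
>>> df['y'] = 10 - df['y']
>>> df['cap'] = 6
>>> df['floor'] = 1.5
>>> m = Prophet(growth='logistic')
>>> m.fit(df) # train the model
>>> future = m.make_future_dataframe(periods=1826) # build the frame to predict on and set the forecast horizon
>>> future['cap'] = 6 # set the bounds after creating future, otherwise they are lost
>>> future['floor'] = 1.5
>>> fcst = m.predict(future) # forecast
>>> fig4 = m.plot(fcst)
>>> fig4.show()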