Interpolating/extrapolating missing dates in Python?

Say I have the following DataFrame:

bb = pd.DataFrame(data = {'date' :['','','','2015-09-02', '2015-09-02', '2015-09-03','','2015-09-08', '', '2015-09-11','2015-09-14','','' ]})     
bb['date'] = pd.to_datetime(bb['date'], format="%Y-%m-%d")     

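I want to linearly interpolate and extrapolate to fill in the missing date values. I used the following code, but it doesn't change anything. I'm new to pandas. Please help.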

bb= bb.interpolate(method='time')

Solution:

To extrapolate, you have to use bfill() and ffill(). Each missing value is filled with the next (or previous) valid value.
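As a minimal sketch of what the two fills do, here is a short series with a gap at each end and one in the middle. The series `s` and its values are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Illustrative series (not the question's data): NaT at both ends and in the middle
s = pd.Series(pd.to_datetime([None, "2015-09-02", None, "2015-09-08", None]))

# bfill() copies the next valid value backwards; it cannot fill the trailing gap
# ffill() then copies the previous valid value forwards into what is left
filled = s.bfill().ffill()
print(filled)
```

Note that neither call computes a value between two dates; they only copy a neighbouring date, which is why a separate interpolation step is needed for the gaps in the middle.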

To do linear interpolation, you have to use the interpolate function, but the dates need to be converted to numbers first:

import numpy as np
import pandas as pd
from datetime import datetime

bb = pd.DataFrame(data = {'date' :['','','','2015-09-02', '2015-09-02', '2015-09-03','','2015-09-08', '', '2015-09-11','2015-09-14','','' ]})     
bb['date'] = pd.to_datetime(bb['date'], format="%Y-%m-%d")     

# convert to seconds
tmp = bb['date'].apply(lambda t: (t-datetime(1970,1,1)).total_seconds())
# linear interpolation
tmp.interpolate(inplace=True)    
# back convert to dates
bb['date'] = pd.to_datetime(tmp, unit='s') 
bb['date'] = bb['date'].apply(lambda t: t.date())
# extrapolation for the leading missing values (interpolate() has already
# carried the last valid date forward into the trailing ones)
bb.bfill(inplace=True)

print(bb)

Result:

         date
0  2015-09-02
1  2015-09-02
2  2015-09-02
3  2015-09-02
4  2015-09-02
5  2015-09-03
6  2015-09-05
7  2015-09-08
8  2015-09-09
9  2015-09-11
10 2015-09-14
11 2015-09-14
12 2015-09-14
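
The same idea can also be sketched without apply(), keeping the column as a datetime dtype throughout. This is only one possible variant, not the answer's original code: the names `raw`, `dates`, `epoch`, `seconds` and `filled` are chosen here for illustration, the missing entries are written as None instead of empty strings, and the explicit ffill()/bfill() calls are added so the edges are filled regardless of how interpolate() treats them.

```python
import pandas as pd

# Same dates as in the question, with None standing in for the empty strings
raw = [None, None, None, "2015-09-02", "2015-09-02", "2015-09-03", None,
       "2015-09-08", None, "2015-09-11", "2015-09-14", None, None]
dates = pd.Series(pd.to_datetime(raw))

# Convert to seconds since the epoch; NaT becomes NaN
epoch = pd.Timestamp("1970-01-01")
seconds = (dates - epoch).dt.total_seconds()

# Linear interpolation for interior gaps, then copy the nearest
# valid value into any gaps left at either end
seconds = seconds.interpolate().ffill().bfill()

# Convert back to datetimes and drop the time-of-day part
filled = pd.to_datetime(seconds, unit="s").dt.normalize()
print(filled)
```

Interpolated values that land on noon (for example halfway between 2015-09-03 and 2015-09-08) are truncated to the start of that day by normalize(), which matches the dates shown in the result above.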