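I am using Python to calculate the time interval between two events. Each event has a "beginning time" and an "ending time". I computed the difference between the two in a new column, "Interval", but the result is negative when the beginning and ending times fall on different dates (for example, a beginning time of 23:46:00 and an ending time of 00:21:00 gives -23:25:00). I want to write an if statement that runs over the "Interval" column and adds 24 hours to any negative value. However, I am having trouble adding 24 hours to the "Interval" values. Currently, my "Interval" dtype is timedelta64[ns].
Here is part of the table, to clarify the problem: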
     CallDate        BeginningTime           EndingTime   Interval
75   1/8/2009  1900-01-01 07:49:00  1900-01-01 08:19:00   00:30:00
76  1/11/2009  1900-01-01 14:37:00  1900-01-01 14:59:00   00:22:00
77   1/9/2009  1900-01-01 09:29:00  1900-01-01 09:56:00   00:27:00
78  1/11/2009  1900-01-01 09:20:00  1900-01-01 10:13:00   00:53:00
79  1/16/2009  1900-01-01 15:11:00  1900-01-01 15:50:00   00:39:00
80  1/17/2009  1900-01-01 22:52:00  1900-01-01 23:26:00   00:34:00
81  1/19/2009  1900-01-01 05:48:00  1900-01-01 06:32:00   00:44:00
82  1/20/2009  1900-01-01 23:46:00  1900-01-01 00:21:00  -23:25:00
83  1/20/2009  1900-01-01 21:29:00  1900-01-01 22:08:00   00:39:00
84  1/23/2009  1900-01-01 07:33:00  1900-01-01 07:55:00   00:22:00
85  1/30/2009  1900-01-01 19:33:00  1900-01-01 20:01:00   00:28:00
Update: this is the code that got me to this point:
df['BeginningTime']=pd.to_datetime(df['BeginningTime'], format='%H:%M')
df['EndingTime']=pd.to_datetime(df['EndingTime'], format='%H:%M')
df['Interval']=df['EndingTime']-df['BeginningTime']
df[['CallDate','BeginningTime','EndingTime','Interval']]
Answer:
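If you just want to add 1 day to the negative timedeltas, you can do the following. Note that the comparison must be against a zero Timedelta, not the integer 0:
df['Interval'] = df['Interval'].apply(lambda x: x + pd.Timedelta(days=1) if x < pd.Timedelta(0) else x)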
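The same correction can probably be done without `apply`, using a boolean mask over the column. This is only a sketch: the two-row DataFrame below is made-up sample data modeled on rows 75 and 82 of the question, and the variable name `negative` is introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data modeled on two rows from the question.
df = pd.DataFrame({
    "BeginningTime": pd.to_datetime(["07:49", "23:46"], format="%H:%M"),
    "EndingTime": pd.to_datetime(["08:19", "00:21"], format="%H:%M"),
})
df["Interval"] = df["EndingTime"] - df["BeginningTime"]

# Boolean mask of rows whose interval is negative (the end time wrapped past midnight).
negative = df["Interval"] < pd.Timedelta(0)

# Add one day only to the rows selected by the mask.
df.loc[negative, "Interval"] += pd.Timedelta(days=1)

print(df["Interval"])
```

On a large column this avoids calling a Python lambda once per row, which is usually the slow part of `apply`.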
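If you can be sure that the ending time falls within 24 hours of the beginning time, you can instead check whether the ending time is earlier than the beginning time and, if so, use timedelta to move the ending time forward by one day, rather than adjusting the interval.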
from datetime import datetime, timedelta

d1 = datetime.strptime("1900-01-01 23:46:00", "%Y-%m-%d %H:%M:%S")
d2 = datetime.strptime("1900-01-01 00:21:00", "%Y-%m-%d %H:%M:%S")

# The end time is earlier than the start time, so it belongs to the next day.
if d2 < d1:
    d2 += timedelta(days=1)

print(d2 - d1)
# 0:35:00
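Under the same assumption (the real gap is always shorter than 24 hours), the if statement can be folded into a modulo, since Python 3 supports `timedelta % timedelta`. A minimal sketch; `wrapped_interval` and `ONE_DAY` are names made up for this example:

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)


def wrapped_interval(start, end):
    """Return end - start, wrapped into the range [0, 24h)."""
    # The remainder takes the sign of the divisor, so a negative
    # difference such as -23:25:00 comes out as 0:35:00.
    return (end - start) % ONE_DAY


fmt = "%Y-%m-%d %H:%M:%S"
d1 = datetime.strptime("1900-01-01 23:46:00", fmt)
d2 = datetime.strptime("1900-01-01 00:21:00", fmt)

print(wrapped_interval(d1, d2))
# 0:35:00
```

Differences that are already non-negative and shorter than a day pass through unchanged. Like the if version, this cannot distinguish a 35-minute gap from one of 24 hours and 35 minutes.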
With pandas, you can do the following:
import pandas as pd
from pandas import Timedelta
d = {
    "CallDate": [
        "1/8/2009",
        "1/11/2009",
        "1/9/2009",
        "1/11/2009",
        "1/16/2009",
        "1/17/2009",
        "1/19/2009",
        "1/20/2009",
        "1/20/2009",
        "1/23/2009",
        "1/30/2009"
    ],
    "BeginningTime": [
        "1900-01-01 07:49:00",
        "1900-01-01 14:37:00",
        "1900-01-01 09:29:00",
        "1900-01-01 09:20:00",
        "1900-01-01 15:11:00",
        "1900-01-01 22:52:00",
        "1900-01-01 05:48:00",
        "1900-01-01 23:46:00",
        "1900-01-01 21:29:00",
        "1900-01-01 07:33:00",
        "1900-01-01 19:33:00"
    ],
    "EndingTime": [
        "1900-01-01 08:19:00",
        "1900-01-01 14:59:00",
        "1900-01-01 09:56:00",
        "1900-01-01 10:13:00",
        "1900-01-01 15:50:00",
        "1900-01-01 23:26:00",
        "1900-01-01 06:32:00",
        "1900-01-01 00:21:00",
        "1900-01-01 22:08:00",
        "1900-01-01 07:55:00",
        "1900-01-01 20:01:00"
    ]
}
df = pd.DataFrame(data=d)
df['BeginningTime']=pd.to_datetime(df['BeginningTime'], format="%Y-%m-%d %H:%M:%S")
df['EndingTime']=pd.to_datetime(df['EndingTime'], format="%Y-%m-%d %H:%M:%S")
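def interval(row):
    # Work on a local copy so the row itself is not modified.
    end = row['EndingTime']
    # An end time earlier than the start time means the event crossed midnight.
    if end < row['BeginningTime']:
        end += Timedelta(days=1)
    return end - row['BeginningTime']

df['Interval'] = df.apply(interval, axis=1)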
In [2]: df
Out[2]:
          BeginningTime   CallDate           EndingTime  Interval
 0  1900-01-01 07:49:00   1/8/2009  1900-01-01 08:19:00  00:30:00
 1  1900-01-01 14:37:00  1/11/2009  1900-01-01 14:59:00  00:22:00
 2  1900-01-01 09:29:00   1/9/2009  1900-01-01 09:56:00  00:27:00
 3  1900-01-01 09:20:00  1/11/2009  1900-01-01 10:13:00  00:53:00
 4  1900-01-01 15:11:00  1/16/2009  1900-01-01 15:50:00  00:39:00
 5  1900-01-01 22:52:00  1/17/2009  1900-01-01 23:26:00  00:34:00
 6  1900-01-01 05:48:00  1/19/2009  1900-01-01 06:32:00  00:44:00
 7  1900-01-01 23:46:00  1/20/2009  1900-01-01 00:21:00  00:35:00
 8  1900-01-01 21:29:00  1/20/2009  1900-01-01 22:08:00  00:39:00
 9  1900-01-01 07:33:00  1/23/2009  1900-01-01 07:55:00  00:22:00
10  1900-01-01 19:33:00  1/30/2009  1900-01-01 20:01:00  00:28:00
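The row-wise check can probably also be written column-wise with `Series.where`, which may matter on larger frames. This is a sketch on two made-up rows taken from the table above; `crossed_midnight` and `adjusted_end` are illustrative names, not part of the original code.

```python
import pandas as pd

# Hypothetical two-row sample (rows 5 and 7 of the output above).
df = pd.DataFrame({
    "BeginningTime": pd.to_datetime(["1900-01-01 22:52:00", "1900-01-01 23:46:00"]),
    "EndingTime": pd.to_datetime(["1900-01-01 23:26:00", "1900-01-01 00:21:00"]),
})

# True where the end time precedes the start time, i.e. the event crossed midnight.
crossed_midnight = df["EndingTime"] < df["BeginningTime"]

# Series.where keeps the original value where the condition is True and takes
# the replacement elsewhere, so only the wrapped rows are shifted by one day.
adjusted_end = df["EndingTime"].where(
    ~crossed_midnight, df["EndingTime"] + pd.Timedelta(days=1)
)

df["Interval"] = adjusted_end - df["BeginningTime"]
print(df)
```

As in the `apply` version, the stored `EndingTime` column is left untouched; only the computed interval reflects the extra day.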