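You need to read in the data and convert the columns to datetime format. I read the data in from the clipboard and parsed the dates there. Next, sort each dataframe by its key (here the keys are "Connect" for df1 and "Start" for df2). After that, pandas merge_asof is sufficient. Note that the nearest-key match can only be made on one key, not several (the optional `by` argument only adds exact-match grouping columns):
Sort the dataframes: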
df1 = df1.sort_values(['Connect','Ended'])
df2 = df2.sort_values(['Start','End'])
Merge the dataframes:
merger = pd.merge_asof(df1, df2,
                       left_on='Connect',
                       right_on='Start',
                       tolerance=pd.Timedelta('20s'),  # accept a match only within 20 seconds
                       direction='forward')            # take the first Start at or after Connect
merger
              Connect               Ended               Start                 End
0 2020-03-31 11:00:08 2020-03-31 11:00:10 2020-03-31 11:00:10 2020-03-31 11:00:14
1 2020-04-01 22:00:05 2020-04-01 12:00:05                 NaT                 NaT
2 2020-04-06 13:15:21 2020-04-06 14:05:18 2020-04-06 13:15:21 2020-04-06 14:05:18
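For anyone who wants to reproduce the steps above without the clipboard, here is a minimal, self-contained sketch. The df1 rows are taken from the output shown; the contents of df2 are an assumption, rebuilt from the matched rows only, since the original input was not posted. Using `DataFrame.apply(pd.to_datetime)` for the conversion is also my choice, not necessarily how the dates were parsed originally.

```python
import pandas as pd

# df1 rows copied from the merged output above.
df1 = pd.DataFrame({
    "Connect": ["2020-03-31 11:00:08", "2020-04-01 22:00:05", "2020-04-06 13:15:21"],
    "Ended":   ["2020-03-31 11:00:10", "2020-04-01 12:00:05", "2020-04-06 14:05:18"],
})
# ASSUMPTION: df2 is reconstructed from the matched rows; the real df2 may hold more rows.
df2 = pd.DataFrame({
    "Start": ["2020-03-31 11:00:10", "2020-04-06 13:15:21"],
    "End":   ["2020-03-31 11:00:14", "2020-04-06 14:05:18"],
})

# Convert every column from strings to datetimes.
df1 = df1.apply(pd.to_datetime)
df2 = df2.apply(pd.to_datetime)

# merge_asof requires both frames to be sorted on their key columns.
df1 = df1.sort_values(["Connect", "Ended"])
df2 = df2.sort_values(["Start", "End"])

# For each Connect, take the first Start at or after it, but only within 20 seconds.
merger = pd.merge_asof(
    df1, df2,
    left_on="Connect",
    right_on="Start",
    tolerance=pd.Timedelta("20s"),
    direction="forward",
)
print(merger)
```

With this sample data, the second df1 row has no Start within 20 seconds after its Connect, so its Start and End come back as NaT.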
From there it should be easy to select the matched and unmatched rows:
matched = merger.dropna()
matched
              Connect               Ended               Start                 End
0 2020-03-31 11:00:08 2020-03-31 11:00:10 2020-03-31 11:00:10 2020-03-31 11:00:14
2 2020-04-06 13:15:21 2020-04-06 14:05:18 2020-04-06 13:15:21 2020-04-06 14:05:18
unmatched = merger.loc[merger.isna().any(axis=1)]
unmatched
              Connect               Ended Start End
1 2020-04-01 22:00:05 2020-04-01 12:00:05   NaT NaT
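One caveat worth considering: `dropna()` and `isna().any(axis=1)` look at every column, so a row with a missing value somewhere unrelated to the merge would also be counted as unmatched. A possible variation, assuming the right-hand key column "Start" is never missing in df2 itself, is to split on that column alone. The `merger` frame below is rebuilt by hand from the output above so that the snippet runs on its own, and `has_match` is just an illustrative name.

```python
import pandas as pd

# Hand-built copy of the merged result shown above (None becomes NaT).
merger = pd.DataFrame({
    "Connect": pd.to_datetime(["2020-03-31 11:00:08", "2020-04-01 22:00:05", "2020-04-06 13:15:21"]),
    "Ended":   pd.to_datetime(["2020-03-31 11:00:10", "2020-04-01 12:00:05", "2020-04-06 14:05:18"]),
    "Start":   pd.to_datetime(["2020-03-31 11:00:10", None, "2020-04-06 13:15:21"]),
    "End":     pd.to_datetime(["2020-03-31 11:00:14", None, "2020-04-06 14:05:18"]),
})

# A row is matched exactly when merge_asof filled in the right-hand key.
has_match = merger["Start"].notna()

matched = merger[has_match]
unmatched = merger[~has_match]
```

On this data the result is the same as with `dropna()`; the difference only shows up when other columns can contain missing values.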
Hopefully that is enough. If you get stuck, the pandas documentation has more examples to guide you.
