Is there a way to join the two data tables below in Python?

Table 1 has 800,000 entries:

End_time             DAY   Exceed  C_time  stn  max   start_time
2019-12-26 12:29:34  PROD  -41.9   21.1    501  21.1  2019-12-26 12:29:13
2019-12-26 12:30:59  PROD  -10.3   52.7    501  52.7  2019-12-26 12:30:07
2019-12-26 12:32:36  PROD  -35.8   27.2    503  27.2  2019-12-26 12:32:09
2019-12-26 12:33:54  PROD  -53.3   9.7     504  9.7   2019-12-26 12:33:45
2019-12-26 12:35:04  PROD  -24.6   38.4    505  38.4  2019-12-26 12:34:26

Table 2 has 300,000 entries:

AlarmMessage     D_time  Priority  Station  EquipID  Active  Quality  LineName  AlarmInTimeStamp
S501LH_B_RR_BT   2       1         501      2200505  True    192      BC1       2019-12-26 12:29:16.5608495
SHT_B_S503_BEAM  21      1         503      2300249  True    192      BC1       2019-12-26 12:32:20.0634165
S503LH_B_RR_T    2       1         503      2200505  True    192      BC1       2019-12-26 12:32:25.6494806
SHT_B_S504_      21      1         504      2300256  True    192      BC1       2019-12-26 12:33:50.6719676

If a row's AlarmInTimeStamp in table 2 falls between start_time and End_time of a row in table 1, and the station is the same in both tables (stn in table 1, Station in table 2), the two rows should be merged. In the end I want to count the alarms generated during each time window and sum their D_time.

The output should look like this:

End_time             DAY   Exceed  C_time  stn  max   start_time           AlarmMessage     D_time
2019-12-26 12:29:34  PROD  -41.9   21.1    501  21.1  2019-12-26 12:29:13  S501LH_B_RR_BT   2
2019-12-26 12:30:59  PROD  -10.3   52.7    501  52.7  2019-12-26 12:30:07  -                -
2019-12-26 12:32:36  PROD  -35.8   27.2    503  27.2  2019-12-26 12:32:09  SHT_B_S503_BEAM  21
                                                                           S503LH_B_RR_T    2
2019-12-26 12:33:54  PROD  -53.3   9.7     504  9.7   2019-12-26 12:33:45  SHT_B_S504       21
2019-12-26 12:35:04  PROD  -24.6   38.4    505  38.4  2019-12-26 12:34:26  -                -

1 Answer

30秒到达战场

You can solve this with pandas and a bit of matrix multiplication:

import pandas as pd

# Attempt #5: Use python and the pandas package
# create the pandas Data Frames (kind of like R data.frame)
myDataDF = pd.DataFrame({'Record': range(1, 6), 'SomeValue': [10, 8, 14, 6, 2]})
linkTableDF = pd.DataFrame({'ValueOfInterest': ['a', 'b', 'c'],
                            'LowerBound': [1, 4, 10],
                            'UpperBound': [3, 5, 16]})

# set the index of the linkTable (kind of like setting row names)
linkTableDF = linkTableDF.set_index('ValueOfInterest')

# now apply a function to each row of the linkTable
# this function checks if any of the values in myData are between the upper
# and lower bound of a specific row thus returning 5 values (length of myData)
mask = linkTableDF.apply(lambda r: myDataDF.SomeValue.between(r['LowerBound'],
                                                              r['UpperBound']), axis=1)

# mask is a 3 (length of linkTable) by 5 matrix of True/False values
# by transposing it we get the row names (the ValueOfInterest) as the column names
mask = mask.T

# we can then matrix multiply mask with its column names
myDataDF['ValueOfInterest'] = mask.dot(mask.columns)
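It may help to see why multiplying a True/False mask by its column names yields labels. The following is a minimal plain-Python sketch of what the dot product does for a single row; the helper name row_dot is hypothetical and is not part of pandas. It relies on the fact that in Python True * 'a' is 'a' and False * 'a' is the empty string, and that adding strings concatenates them.

```python
from functools import reduce
import operator

# Column names of the transposed mask (the ValueOfInterest labels)
labels = ["a", "b", "c"]

def row_dot(flags, labels):
    """Dot product of one boolean mask row with the label vector (hypothetical helper)."""
    # True * 'a' -> 'a', False * 'a' -> ''
    terms = [flag * label for flag, label in zip(flags, labels)]
    # "Summing" strings means concatenating them
    return reduce(operator.add, terms)

one_match = row_dot([False, False, True], labels)     # value lies only in interval 'c'
no_match = row_dot([False, False, False], labels)     # value lies in no interval
two_matches = row_dot([True, True, False], labels)    # value lies in 'a' and 'b'
```

One consequence worth noting: a value that lies in no interval gets an empty string, and a value that lies in several overlapping intervals gets their labels concatenated (here 'ab'), so the trick is cleanest when the intervals do not overlap.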

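In your case, you could build the time mask like this, where table1 and table2 stand for your two tables:

mask = table1.apply(lambda r: table2.AlarmInTimeStamp.between(r['start_time'],
                                                              r['End_time']), axis=1)

This checks only the time window, so the station match (stn against Station) still has to be applied separately.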
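For the data in the question, a different route is to merge on the station first and then filter on the time window, which handles both conditions and avoids building a full rows-by-rows mask. This is only a sketch under assumptions: the small frames below are stand-ins built from the sample rows, the time columns are assumed to be parsed as datetimes, and the helper column interval_id is introduced here purely for illustration.

```python
import pandas as pd

# Stand-ins for the two tables, reduced to the columns the join needs
table1 = pd.DataFrame({
    "stn": [501, 501, 503, 504, 505],
    "start_time": pd.to_datetime([
        "2019-12-26 12:29:13", "2019-12-26 12:30:07", "2019-12-26 12:32:09",
        "2019-12-26 12:33:45", "2019-12-26 12:34:26"]),
    "End_time": pd.to_datetime([
        "2019-12-26 12:29:34", "2019-12-26 12:30:59", "2019-12-26 12:32:36",
        "2019-12-26 12:33:54", "2019-12-26 12:35:04"]),
})
table2 = pd.DataFrame({
    "AlarmMessage": ["S501LH_B_RR_BT", "SHT_B_S503_BEAM",
                     "S503LH_B_RR_T", "SHT_B_S504_"],
    "D_time": [2, 21, 2, 21],
    "Station": [501, 503, 503, 504],
    "AlarmInTimeStamp": pd.to_datetime([
        "2019-12-26 12:29:16.5608495", "2019-12-26 12:32:20.0634165",
        "2019-12-26 12:32:25.6494806", "2019-12-26 12:33:50.6719676"]),
})

# Give every time window an id so matches can be attached back to it
# (interval_id is a hypothetical helper column)
intervals = table1.reset_index().rename(columns={"index": "interval_id"})

# Step 1: pair every window with every alarm from the same station
pairs = intervals.merge(table2, left_on="stn", right_on="Station", how="inner")

# Step 2: keep only the pairs whose alarm falls inside the window
in_window = pairs["AlarmInTimeStamp"].between(pairs["start_time"], pairs["End_time"])
hits = pairs.loc[in_window, ["interval_id", "AlarmMessage", "D_time"]]

# Step 3: left-join back so windows without alarms are kept (as NaN)
joined = intervals.merge(hits, on="interval_id", how="left")

# Count alarms and sum D_time per window
summary = joined.groupby("interval_id").agg(
    alarm_count=("AlarmMessage", "count"),
    d_time_sum=("D_time", "sum"),
)
```

The intermediate pairs frame holds every window-alarm combination per station, so with 800,000 and 300,000 rows it can become large if there are only a few stations. Processing one station (or one day) at a time is one way to keep memory in check.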

Alternatively, you can run SQL against the tables.
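As a rough illustration of the SQL route, the tables can be loaded into an in-memory SQLite database and joined with a LEFT JOIN that carries both conditions. This is a sketch under assumptions: the frames are small stand-ins built from the sample rows, and the timestamps are stored as text in the same "YYYY-MM-DD HH:MM:SS" layout, so that SQLite's string comparison orders them chronologically.

```python
import sqlite3
import pandas as pd

# Stand-ins for the two tables; timestamps kept as ISO-style text
table1 = pd.DataFrame({
    "stn": [501, 501, 503, 504, 505],
    "start_time": ["2019-12-26 12:29:13", "2019-12-26 12:30:07",
                   "2019-12-26 12:32:09", "2019-12-26 12:33:45",
                   "2019-12-26 12:34:26"],
    "End_time": ["2019-12-26 12:29:34", "2019-12-26 12:30:59",
                 "2019-12-26 12:32:36", "2019-12-26 12:33:54",
                 "2019-12-26 12:35:04"],
})
table2 = pd.DataFrame({
    "AlarmMessage": ["S501LH_B_RR_BT", "SHT_B_S503_BEAM",
                     "S503LH_B_RR_T", "SHT_B_S504_"],
    "D_time": [2, 21, 2, 21],
    "Station": [501, 503, 503, 504],
    "AlarmInTimeStamp": ["2019-12-26 12:29:16.5608495",
                         "2019-12-26 12:32:20.0634165",
                         "2019-12-26 12:32:25.6494806",
                         "2019-12-26 12:33:50.6719676"],
})

conn = sqlite3.connect(":memory:")
table1.to_sql("table1", conn, index=False)
table2.to_sql("table2", conn, index=False)

# LEFT JOIN keeps windows without alarms; both conditions go in the ON clause
query = """
SELECT t1.stn, t1.start_time, t1.End_time, t2.AlarmMessage, t2.D_time
FROM table1 AS t1
LEFT JOIN table2 AS t2
  ON t2.Station = t1.stn
 AND t2.AlarmInTimeStamp BETWEEN t1.start_time AND t1.End_time
ORDER BY t1.start_time, t2.AlarmInTimeStamp
"""
result = pd.read_sql_query(query, conn)
conn.close()
```

Windows with no matching alarm come back with empty AlarmMessage and D_time values, which corresponds to the "-" rows in the expected output.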

Source: https://www.mango-solutions.com/in-between-a-rock-and-a-conditional-join/

2022-09-20
