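To calculate distances with a library, both sets of points need to be in one common coordinate system. From a quick Google search, I believe you are using epsg:21781.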
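Because the EPSG code here is a guess, a quick range check on the raw numbers can help confirm it before converting anything. This is a minimal sketch, assuming the data sits on one of the two Swiss national grids. `guess_swiss_grid` is a hypothetical helper, and its bounds are rough boxes around the grids' false origins (2,600,000 / 1,200,000 for LV95, 600,000 / 200,000 for LV03), not official extents.

```python
def guess_swiss_grid(easting, northing):
    """Return the EPSG code whose rough value box contains the point, or None.

    The boxes are illustrative assumptions, not the official grid extents.
    """
    # CH1903+ / LV95 (EPSG:2056): 7-digit values around E 2,600,000 / N 1,200,000
    if 2_400_000 <= easting <= 2_900_000 and 1_000_000 <= northing <= 1_400_000:
        return "EPSG:2056"
    # CH1903 / LV03 (EPSG:21781): 6-digit values around E 600,000 / N 200,000
    if 400_000 <= easting <= 900_000 and 0 <= northing <= 400_000:
        return "EPSG:21781"
    return None


# First sample row of the shape data used in this answer
print(guess_swiss_grid(2669827.294, 1261034.528))
```

If the check points at a different grid than the one assumed, passing that EPSG code to pyproj instead is probably the thing to try.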
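- First, standardise the coordinate system using pyproj
- Build the Cartesian product of the colors and shapes
- Calculate the distance between each pair using geopy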
You can now select the result rows you want. As an example, I have taken the nearest match when grouping by color and shape.
import pandas as pd
import pyproj
import geopy.distance

df1 = pd.DataFrame({'Ecode': [2669827.294, 2669634.483, 2669766.266, 2669960.683],
                    'Ncode': [1261034.528, 1262412.587, 1261209.646, 1262550.374],
                    'shape': ['square', 'square', 'triangle', 'circle']})
df2 = pd.DataFrame({'CoorE': [2669636, 2669765, 2669827, 2669961],
                    'CoorN': [1262413, 1261211, 1261032, 1262550],
                    'color': ['purple', 'blue', 'blue', 'yellow']})

# Assuming this coordinate system https://epsg.io/21781, then mapping to https://epsg.io/4326.
# Swap in the EPSG code that matches your data if this guess is wrong.
# Transformer replaces the deprecated pyproj.transform; with the default axis order
# the EPSG:4326 result is (latitude, longitude), which is the order geopy expects.
to_gps = pyproj.Transformer.from_crs("EPSG:21781", "EPSG:4326")

df1 = df1.assign(
    shape_gps=lambda x: x.apply(lambda r: to_gps.transform(r["Ecode"], r["Ncode"]), axis=1)
)
df2 = df2.assign(
    color_gps=lambda x: x.apply(lambda r: to_gps.transform(r["CoorE"], r["CoorN"]), axis=1)
)

(
    df1.merge(df2, how="cross")  # Cartesian product of shapes and colors (pandas >= 1.2)
    .assign(distance=lambda x: x.apply(
        lambda r: geopy.distance.geodesic(r["color_gps"], r["shape_gps"]).km, axis=1))
    .sort_values("distance")  # nearest pairs first, so "first" below picks the closest
    .groupby(["color", "shape"])
    .agg({"distance": "first", "CoorE": "first", "CoorN": "first"})
)
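The cross-product-then-nearest idea can also be tried without any geospatial library. The sketch below is an illustration under one assumption: both data sets are on the same metric projected grid, so straight-line distance on the raw easting/northing values is a reasonable stand-in for the geodesic distance over a few kilometres. `planar_distance_m` is a hypothetical helper name, and the sample points are copied from the data frames in this answer.

```python
import itertools
import math

# (easting, northing, label), copied from df1 and df2
shapes = [
    (2669827.294, 1261034.528, "square"),
    (2669634.483, 1262412.587, "square"),
    (2669766.266, 1261209.646, "triangle"),
    (2669960.683, 1262550.374, "circle"),
]
colors = [
    (2669636, 1262413, "purple"),
    (2669765, 1261211, "blue"),
    (2669827, 1261032, "blue"),
    (2669961, 1262550, "yellow"),
]


def planar_distance_m(a, b):
    """Straight-line distance in grid units (metres on a metric projected grid)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Cartesian product: every shape paired with every color, plus their distance
pairs = [(s, c, planar_distance_m(s, c)) for s, c in itertools.product(shapes, colors)]

# For each shape row, keep the pair with the smallest distance
nearest = [min((p for p in pairs if p[0] is s), key=lambda p: p[2]) for s in shapes]

for s, c, d in nearest:
    print(f"{s[2]:8s} -> {c[2]:6s} {d:.2f} m")
```

Unlike the grouped result, this keeps one match per shape row, which may be closer to what a row-by-row lookup needs.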
Update: nearest merge
If you pick a reference point and calculate every distance from it, you get what you want.
import pandas as pd
import pyproj
import geopy.distance

df1 = pd.DataFrame({'Ecode': [2669827.294, 2669634.483, 2669766.266, 2669960.683],
                    'Ncode': [1261034.528, 1262412.587, 1261209.646, 1262550.374],
                    'shape': ['square', 'square', 'triangle', 'circle']})
df2 = pd.DataFrame({'CoorE': [2669636, 2669765, 2669827, 2669961],
                    'CoorN': [1262413, 1261211, 1261032, 1262550],
                    'color': ['purple', 'blue', 'blue', 'yellow']})

# Assuming this coordinate system https://epsg.io/21781, then mapping to https://epsg.io/4326.
# Swap in the EPSG code that matches your data if this guess is wrong.
# Transformer replaces the deprecated pyproj.transform and returns (latitude, longitude) here.
to_gps = pyproj.Transformer.from_crs("EPSG:21781", "EPSG:4326")

# Pick a reference point for use in the distance calculations (first row of df1)
refpoint = to_gps.transform(df1.loc[0, "Ecode"], df1.loc[0, "Ncode"])

df1 = df1.assign(
    shape_gps=lambda x: x.apply(lambda r: to_gps.transform(r["Ecode"], r["Ncode"]), axis=1),
    distance=lambda x: x.apply(lambda r: geopy.distance.geodesic(refpoint, r["shape_gps"]).km, axis=1),
).sort_values("distance")  # merge_asof needs both frames sorted by the key
df2 = df2.assign(
    color_gps=lambda x: x.apply(lambda r: to_gps.transform(r["CoorE"], r["CoorN"]), axis=1),
    distance=lambda x: x.apply(lambda r: geopy.distance.geodesic(refpoint, r["color_gps"]).km, axis=1),
).sort_values("distance")

# No cleanup of columns, but this works
pd.merge_asof(df1, df2, on="distance", direction="nearest")
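To make clearer what a nearest-key merge does with these distances, here is a small pure-Python sketch of the same idea: each point is reduced to a single number (its distance to the reference point), both lists are sorted by that number, and each left key is matched to the closest right key with a binary search. It is only an illustration. `keyed` and `nearest_key_match` are hypothetical helpers, planar distance on the raw grid values stands in for the geodesic distance, and the tie handling here is not guaranteed to match pandas.

```python
import bisect
import math

# (easting, northing, label), copied from df1 and df2
shapes = [
    (2669827.294, 1261034.528, "square"),
    (2669634.483, 1262412.587, "square"),
    (2669766.266, 1261209.646, "triangle"),
    (2669960.683, 1262550.374, "circle"),
]
colors = [
    (2669636, 1262413, "purple"),
    (2669765, 1261211, "blue"),
    (2669827, 1261032, "blue"),
    (2669961, 1262550, "yellow"),
]
ref = shapes[0][:2]  # reference point: the first shape row


def keyed(points):
    """Return (distance_to_ref, label) pairs sorted by distance."""
    return sorted((math.hypot(e - ref[0], n - ref[1]), label) for e, n, label in points)


def nearest_key_match(left, right):
    """For each (key, label) in left, find the right label whose key is closest."""
    keys = [k for k, _ in right]
    matches = []
    for key, label in left:
        i = bisect.bisect_left(keys, key)
        # Only the neighbours on either side of the insertion point can be nearest
        candidates = [j for j in (i - 1, i) if 0 <= j < len(keys)]
        best = min(candidates, key=lambda j: abs(keys[j] - key))
        matches.append((label, right[best][1]))
    return matches


matches = nearest_key_match(keyed(shapes), keyed(colors))
print(matches)
```

Because the key is one-dimensional, two points at a similar distance from the reference but in different directions would get nearly the same key, so it seems worth spot-checking the matches on real data.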