Suppose I have a DataFrame retailer_info like this:

   price  product_name                                   url
0  5005   Intel Pentium Gold G5400 3.70 GHz Processor    https://www.theitdepot.com/details-Intel+Penti...
1  7150   Intel Core i3-9100F 3.60 GHz Processor         https://www.theitdepot.com/details-Intel+Core+...
2  8210   AMD Ryzen 3 2200G with Radeon Vega 8 Graphics  https://www.theitdepot.com/details-AMD+Ryzen+3...
3  8415   AMD Ryzen 3 3200G with Radeon Vega 8 Graphics  https://www.theitdepot.com/details-AMD+Ryzen+3...
4  10330  AMD Ryzen 5 1600 3.2 GHz Processor             https://www.theitdepot.com/details-AMD+Ryzen+5...

I have another DataFrame, cpu_info, which looks like this:

     Type  Part Number    Brand  Model          Rank
92   CPU   YD1600BBAEBOX  AMD    Ryzen 5 1600   93
96   CPU   YD250XBBM4KAF  AMD    Ryzen 5 2500X  97
108  CPU   YD3200C5FHBOX  AMD    Ryzen 3 3200G  109
129  CPU   YD150XBBAEBOX  AMD    Ryzen 5 1500X  130
138  CPU   YD2400C5FBBOX  AMD    Ryzen 5 2400G  139
139  CPU   YD2200C5FBBOX  AMD    Ryzen 3 2200G  140
153  CPU   YD130XBBAEBOX  AMD    Ryzen 3 1300X  154

Now, for each value in the Series cpu_info['Model'], I need to check whether it is a substring of any value in the Series retailer_info['product_name']. If it is, I want to merge the url column from retailer_info into the cpu_info DataFrame.

Expected result:

     Type  Part Number    Brand  Model          Rank  url
92   CPU   YD1600BBAEBOX  AMD    Ryzen 5 1600   93    https://www.theitdepot.com/details-AMD+Ryzen+5...
96   CPU   YD250XBBM4KAF  AMD    Ryzen 5 2500X  97    NaN
108  CPU   YD3200C5FHBOX  AMD    Ryzen 3 3200G  109   https://www.theitdepot.com/details-AMD+Ryzen+3...
129  CPU   YD150XBBAEBOX  AMD    Ryzen 5 1500X  130   NaN
138  CPU   YD2400C5FBBOX  AMD    Ryzen 5 2400G  139   NaN
139  CPU   YD2200C5FBBOX  AMD    Ryzen 3 2200G  140   https://www.theitdepot.com/details-AMD+Ryzen+3...
153  CPU   YD130XBBAEBOX  AMD    Ryzen 3 1300X  154   NaN

I realize that new_df = pd.merge(cpu, it['product_name', 'url'], on='', how='left') only works when you want to merge on exactly matching column values. I'm not sure how to get the result I want. I would really appreciate any help. Thanks.
2 Answers

qq_笑_17
Contributed 1818 experience points · received more than 7 likes
You can add multiple conditions:
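import pandas as pd

# Map each retailer product name to its URL.
dicc = pd.Series(retailer_info["url"].values, index=retailer_info["product_name"]).to_dict()
cpu_info["url"] = ""  # stays empty for CPUs with no matching product
for index, row in cpu_info.iterrows():
    for key in dicc:
        # Both the brand and the model must appear in the product name.
        if row["Brand"] in key and row["Model"] in key:
            cpu_info.at[index, "url"] = dicc[key]
            break  # keep the first match only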
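For anyone who wants to run this check end to end, here is a minimal sketch on the sample rows from the question. It assumes the URLs stay truncated exactly as they were posted, and the function name attach_urls is made up for illustration. It returns a copy instead of modifying cpu_info in place, and leaves unmatched rows as missing values rather than empty strings.

```python
import pandas as pd

# Sample rows copied from the question; URLs are truncated there and kept as-is.
retailer_info = pd.DataFrame({
    "price": [5005, 7150, 8210, 8415, 10330],
    "product_name": [
        "Intel Pentium Gold G5400 3.70 GHz Processor",
        "Intel Core i3-9100F 3.60 GHz Processor",
        "AMD Ryzen 3 2200G with Radeon Vega 8 Graphics",
        "AMD Ryzen 3 3200G with Radeon Vega 8 Graphics",
        "AMD Ryzen 5 1600 3.2 GHz Processor",
    ],
    "url": [
        "https://www.theitdepot.com/details-Intel+Penti...",
        "https://www.theitdepot.com/details-Intel+Core+...",
        "https://www.theitdepot.com/details-AMD+Ryzen+3...",
        "https://www.theitdepot.com/details-AMD+Ryzen+3...",
        "https://www.theitdepot.com/details-AMD+Ryzen+5...",
    ],
})

cpu_info = pd.DataFrame(
    {
        "Type": ["CPU"] * 7,
        "Part Number": [
            "YD1600BBAEBOX", "YD250XBBM4KAF", "YD3200C5FHBOX", "YD150XBBAEBOX",
            "YD2400C5FBBOX", "YD2200C5FBBOX", "YD130XBBAEBOX",
        ],
        "Brand": ["AMD"] * 7,
        "Model": [
            "Ryzen 5 1600", "Ryzen 5 2500X", "Ryzen 3 3200G", "Ryzen 5 1500X",
            "Ryzen 5 2400G", "Ryzen 3 2200G", "Ryzen 3 1300X",
        ],
        "Rank": [93, 97, 109, 130, 139, 140, 154],
    },
    index=[92, 96, 108, 129, 138, 139, 153],
)


def attach_urls(cpus, retailers):
    """Hypothetical helper: return a copy of cpus with a url column added."""
    name_to_url = dict(zip(retailers["product_name"], retailers["url"]))
    urls = []
    for brand, model in zip(cpus["Brand"], cpus["Model"]):
        # First product whose name contains both the brand and the model, else None.
        match = next(
            (url for name, url in name_to_url.items() if brand in name and model in name),
            None,
        )
        urls.append(match)
    out = cpus.copy()
    out["url"] = urls
    return out


result = attach_urls(cpu_info, retailer_info)
print(result[["Model", "url"]])
```

Only the three models that appear in a product name (Ryzen 5 1600, Ryzen 3 3200G, Ryzen 3 2200G) get a URL; the other rows are left missing.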

喵喵时光机
Contributed 1846 experience points · received more than 7 likes
Try this. It should work:
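def find_url(model_name):
    try:
        # regex=False treats the model name as a literal substring, not a regex.
        return retailer_info[retailer_info['product_name'].str.contains(model_name, regex=False)]['url'].values[0]
    except IndexError:  # no product name contains this model
        return None

cpu_info['url'] = cpu_info['Model'].apply(find_url)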
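The pd.merge route from the question can also be made to work: pair every CPU with every product first, then keep only the pairs where the model is a substring of the product name. This is a different technique from the answer above, not a restatement of it. A minimal sketch, assuming pandas 1.2 or later (needed for how="cross") and a trimmed subset of the question's rows; the names cpu_idx, pairs, and matches are made up for illustration.

```python
import pandas as pd

# Trimmed subset of the question's data; URLs are truncated there and kept as-is.
retailer_info = pd.DataFrame({
    "product_name": [
        "Intel Core i3-9100F 3.60 GHz Processor",
        "AMD Ryzen 3 2200G with Radeon Vega 8 Graphics",
        "AMD Ryzen 5 1600 3.2 GHz Processor",
    ],
    "url": [
        "https://www.theitdepot.com/details-Intel+Core+...",
        "https://www.theitdepot.com/details-AMD+Ryzen+3...",
        "https://www.theitdepot.com/details-AMD+Ryzen+5...",
    ],
})
cpu_info = pd.DataFrame(
    {"Brand": ["AMD"] * 3, "Model": ["Ryzen 5 1600", "Ryzen 5 2500X", "Ryzen 3 2200G"]},
    index=[92, 96, 139],
)

# Pair every CPU with every retailer product (3 x 3 = 9 rows here).
# The original index is kept in a column named cpu_idx so it can be restored.
pairs = cpu_info.rename_axis("cpu_idx").reset_index().merge(retailer_info, how="cross")

# Keep only the pairs where the model is a plain substring of the product name.
is_match = [model in name for model, name in zip(pairs["Model"], pairs["product_name"])]
matches = pairs.loc[is_match].drop_duplicates("cpu_idx").set_index("cpu_idx")

# Left-join the matched URLs back; CPUs without a match get a missing value.
merged = cpu_info.join(matches["url"])
print(merged)
```

The cross join builds len(cpu_info) × len(retailer_info) rows, so this is only reasonable for small tables; for larger ones the per-row lookup shown above uses less memory.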