为了账号安全,请及时绑定邮箱和手机立即绑定

使用 bash/python 合并两个 CSV 文件

使用 bash/python 合并两个 CSV 文件

慕村9548890 2022-11-09 17:10:06
我有两个 CSV 文件需要帮助映射/合并:CSV 文件 1:"ID","Name","Flavor""45fc754d-6a9b-4bde-b7ad-be91ae60f582","test1","m1.medium""83dbc739-e436-4c9f-a561-c5b40a3a6da5","test2","m1.tiny""ef68fcf3-f624-416d-a59b-bb8f1aa2a769","test3","m1.medium"CSV 文件 2:"Name","RAM","Disk","VCPUs""m1.medium",4096,40,2"m1.xlarge",16384,160,8"m1.tiny",128,1,1理想的输出是:"ID","Name","Flavor","RAM","Disk","VCPUs""45fc754d-6a9b-4bde-b7ad-be91ae60f582","test1","m1.medium",4096,40,2"83dbc739-e436-4c9f-a561-c5b40a3a6da5","test2","m1.tiny",128,1,1"ef68fcf3-f624-416d-a59b-bb8f1aa2a769","test3","m1.medium",4096,40,2请注意,Flavor在 CSV 文件 1 中和Name在 CSV 文件 2 中是相同的。名称上的差异是用于提取信息的不同工具的结果。另请注意,CSV File2 有一个flavor/name m1.xlarge. 如上所述,如果m1.xlarge flavor/name在 CSV File1 中未找到 ,则应将其从合并输出中丢弃。我整天都在做这件事,结果好坏参半。任何想法,将不胜感激。
查看完整描述

3 回答

?
斯蒂芬大帝

TA贡献1827条经验 获得超8个赞

你可以使用这个awk:


awk -v hdr='"ID","Name","Flavor","RAM","Disk","VCPUs"' 'BEGIN {

   FS=OFS=","

   print hdr

}

NR == FNR {

   a[$1] = $2 FS $3 FS $4

   next

}

$3 in a {

   print $0, a[$3]

}' file2.csv file1.csv

"ID","Name","Flavor","RAM","Disk","VCPUs"

"45fc754d-6a9b-4bde-b7ad-be91ae60f582","test1","m1.medium",4096,40,2

"83dbc739-e436-4c9f-a561-c5b40a3a6da5","test2","m1.tiny",128,1,1

"ef68fcf3-f624-416d-a59b-bb8f1aa2a769","test3","m1.medium",4096,40,2


查看完整回答
反对 回复 2022-11-09
?
白猪掌柜的

TA贡献1893条经验 获得超10个赞

像这样的东西,但你必须玩转报价选项才能看到你喜欢什么。


#!/usr/bin/env python3


import csv


by_name = {}

with open('b.csv') as b:

    for row in csv.DictReader(b):

        name = row.pop('Name')

        by_name[name] = row


with open('c.csv', 'w') as c:

    w = csv.DictWriter(c, ['ID', 'Name', 'Flavor', 'RAM', 'Disk', 'VCPUs'])

    w.writeheader()


    with open('a.csv') as a:

        for row in csv.DictReader(a):

            try:

                match = by_name[row['Flavor']]

            except KeyError:

                continue


            row.update(match)


            w.writerow(row)

输出:


ID,Name,Flavor,RAM,Disk,VCPUs

45fc754d-6a9b-4bde-b7ad-be91ae60f582,test1,m1.medium,4096,40,2

83dbc739-e436-4c9f-a561-c5b40a3a6da5,test2,m1.tiny,128,1,1

ef68fcf3-f624-416d-a59b-bb8f1aa2a769,test3,m1.medium,4096,40,2


查看完整回答
反对 回复 2022-11-09
?
撒科打诨

TA贡献1934条经验 获得超2个赞

如果我正确理解了您的问题,并且您想根据列中的字符串将第一个文件中的行与第二个 csvFlavor中的列中具有该值的行进行匹配Name,那么这很容易做到xsv(您'可能需要先安装):


$ xsv join "Flavor" file1.csv "Name" file2.csv

ID,Name,Flavor,Name,RAM,Disk,VCPUs

45fc754d-6a9b-4bde-b7ad-be91ae60f582,test1,m1.medium,m1.medium,4096,40,2

83dbc739-e436-4c9f-a561-c5b40a3a6da5,test2,m1.tiny,m1.tiny,128,1,1

ef68fcf3-f624-416d-a59b-bb8f1aa2a769,test3,m1.medium,m1.medium,4096,40,2

您还必须删除重复的Name列,您可以xsv再次使用它:


$ xsv join "Flavor" file1.csv "Name" file2.csv | xsv select ID,Name,Flavor,RAM,Disk,VCPUs

ID,Name,Flavor,RAM,Disk,VCPUs

45fc754d-6a9b-4bde-b7ad-be91ae60f582,test1,m1.medium,4096,40,2

83dbc739-e436-4c9f-a561-c5b40a3a6da5,test2,m1.tiny,128,1,1

ef68fcf3-f624-416d-a59b-bb8f1aa2a769,test3,m1.medium,4096,40,2


查看完整回答
反对 回复 2022-11-09
  • 3 回答
  • 0 关注
  • 93 浏览

添加回答

举报

0/150
提交
取消
意见反馈 帮助中心 APP下载
官方微信