I would like to isolate the following values from an XML file (https://digitallibrary.un.org/search?ln=en&p=A/RES/72/266&f=&rm=&ln=en&sf=&so=d&rg=50&c=United+Nations+Digital+Library+System&of=xm&fti=0&fti=0).

<collection>
  <record>
    ...
    <datafield tag="993" ind1="2" ind2=" ">
      <subfield code="a">A/C.5/72/L.22</subfield>   # Value to isolate: A/C.5/72/L.22
    </datafield>
    <datafield tag="993" ind1="3" ind2=" ">
      <subfield code="a">A/72/682</subfield>   # Value to isolate: A/72/682
    </datafield>
    <datafield tag="993" ind1="4" ind2=" ">
      <subfield code="a">A/72/PV.76</subfield>   # Value to isolate: A/72/PV.76
    </datafield>
    ...
  </record>
  <record>
    ...
    <datafield tag="993" ind1="2" ind2=" ">
      <subfield code="a">A/C.5/72/L.22</subfield>   # Value to isolate: A/C.5/72/L.22
    </datafield>
    <datafield tag="993" ind1="3" ind2=" ">
      <subfield code="a">A/72/682</subfield>   # Value to isolate: A/72/682
    </datafield>
  </record>
  ...
</collection>

The code I have prepared seems to pick up only the first item with tag 993 in each record.

for record in root:
    if record.find("{http://www.loc.gov/MARC21/slim}datafield[@tag='993']/{http://www.loc.gov/MARC21/slim}subfield[@code='a']") is not None:
        symbol = record.find("{http://www.loc.gov/MARC21/slim}datafield[@tag='993']/{http://www.loc.gov/MARC21/slim}subfield[@code='a']").text
        print(symbol)

Is there a way to loop ElementTree's XPath search so that it finds every matching element rather than just the first? Thanks in advance.
2 Answers
FFIVE
Contributed 1797 experience points, received more than 6 likes
To complement user3091877's answer, here is an alternative XPath option:
//*[name()="subfield"][@code="a"][parent::*[@tag="993"]]/text()
Edit: this one returns 6 values (@tag = 993 and @ind1 = 3):
//*[name()="subfield"][parent::*[@tag="993" and @ind1="3"]]/text()
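Note that both expressions rely on name() and the parent:: axis, which the XPath subset in xml.etree.ElementTree does not support, so they need a full XPath 1.0 engine (lxml, for example). If you want to stay with the standard library, a rough equivalent of the second filter could look like the sketch below. The sample document, the "marc" prefix and the helper name symbols_with_ind1 are my own assumptions for illustration; only the namespace URI and the tag/code values come from the question.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping; the "marc" prefix is arbitrary, the URI is the one from the question.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed-down sample (assumed): the real file is expected to declare this default namespace.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="993" ind1="2" ind2=" "><subfield code="a">A/C.5/72/L.22</subfield></datafield>
    <datafield tag="993" ind1="3" ind2=" "><subfield code="a">A/72/682</subfield></datafield>
    <datafield tag="993" ind1="4" ind2=" "><subfield code="a">A/72/PV.76</subfield></datafield>
  </record>
  <record>
    <datafield tag="993" ind1="2" ind2=" "><subfield code="a">A/C.5/72/L.22</subfield></datafield>
    <datafield tag="993" ind1="3" ind2=" "><subfield code="a">A/72/682</subfield></datafield>
  </record>
</collection>"""


def symbols_with_ind1(root, ind1):
    """Return the text of every 993 $a subfield whose datafield has the given ind1."""
    # Chained predicates filter on the parent datafield, replacing parent::*[...].
    path = (
        f".//marc:datafield[@tag='993'][@ind1='{ind1}']"
        "/marc:subfield[@code='a']"
    )
    return [subfield.text for subfield in root.findall(path, NS)]


root = ET.fromstring(SAMPLE)
print(symbols_with_ind1(root, "3"))
```

On this sample the call returns the two A/72/682 entries, one per record.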
慕码人8056858
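Contributed 1803 experience points, received more than 6 likes
The documentation says that .find() only returns the first matching subelement. It sounds like you want .findall() instead.
The following seems to work for me: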
import xml.etree.ElementTree as ET

# input_file is the path to the XML file saved from the URL in the question
tree = ET.parse(input_file)
root = tree.getroot()

xpath = "{http://www.loc.gov/MARC21/slim}datafield[@tag='993']/{http://www.loc.gov/MARC21/slim}subfield[@code='a']"
for record in root:
    # findall() always returns a list (empty when nothing matches), so no None check is needed
    for symbol in record.findall(xpath):
        print(symbol.text)
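If the repeated {http://www.loc.gov/MARC21/slim} prefix gets in the way, findall() also accepts a prefix-to-URI mapping as its second argument. Below is a minimal sketch of the same loop that keeps the symbols grouped per record. The inline sample, the "marc" prefix and the function name symbols_by_record are assumptions made for the example, not part of the original file.

```python
import xml.etree.ElementTree as ET

# The prefix name is arbitrary; only the URI has to match the document's namespace.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Small stand-in for the downloaded file (assumed to use this default namespace).
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="993" ind1="2" ind2=" "><subfield code="a">A/C.5/72/L.22</subfield></datafield>
    <datafield tag="993" ind1="3" ind2=" "><subfield code="a">A/72/682</subfield></datafield>
    <datafield tag="993" ind1="4" ind2=" "><subfield code="a">A/72/PV.76</subfield></datafield>
  </record>
  <record>
    <datafield tag="993" ind1="2" ind2=" "><subfield code="a">A/C.5/72/L.22</subfield></datafield>
    <datafield tag="993" ind1="3" ind2=" "><subfield code="a">A/72/682</subfield></datafield>
  </record>
</collection>"""


def symbols_by_record(root):
    """Return one list of 993 $a values per record, in document order."""
    path = "marc:datafield[@tag='993']/marc:subfield[@code='a']"
    return [
        [subfield.text for subfield in record.findall(path, NS)]
        for record in root.findall("marc:record", NS)
    ]


root = ET.fromstring(SAMPLE)
for symbols in symbols_by_record(root):
    print(symbols)
```

A record without any 993 datafield simply yields an empty list, so no extra check is needed there either.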
