Using BeautifulSoup to extract text from the <ol> that follows a specific <h3> inside a specific <div>

泛舟湖上清波郎朗 2023-08-21 19:23:12
I am trying to use BeautifulSoup to extract the text of one particular "ol" on this page. The information I want sits under a specific "div" with a specific class, but I only want the list items that come immediately after a certain "h3", namely the one that contains a "span" with a class and an id.

The output should be:

Verb
1. (transitive) To join or unite (e.g. one thing to another, or as several particulars) so as to increase the number, augment the quantity or enlarge the magnitude, or so as to form into one aggregate.
2. To sum up; to put together mentally.
...

What I have so far:

from bs4 import BeautifulSoup
import urllib

url = urllib.urlopen('https://en.wiktionary.org/wiki/add#English')
content = url.read()
soup = BeautifulSoup(content, 'lxml')
main_div = soup.findAll('div',attrs={"class":"mw-parser-output"})
for x in main_div:
    all_h3 = x.findAll('h3')
    all_ol = x.findAll('ol')

The first answer to this question may be relevant, but I don't know how to adapt it to my task.

1 Answer

慕娘9325324

Instead of BeautifulSoup, you can use lxml.html with XPath expressions.
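Before the full script, a minimal offline sketch may help show what the two XPath steps do. The HTML in `SAMPLE` is a hypothetical, simplified stand-in for the page (the placeholder sense texts are made up), so the live markup may differ. The sketch also collects each top-level list item separately, which is an addition for illustration and not part of the script that follows.

```python
from lxml import html

# Hypothetical, heavily simplified stand-in for the page markup.
SAMPLE = """
<div class="mw-parser-output">
  <h3><span class="mw-headline" id="Noun">Noun</span></h3>
  <ol><li>first noun sense</li></ol>
  <h3><span class="mw-headline" id="Verb">Verb</span></h3>
  <p>placeholder paragraph between heading and list</p>
  <ol>
    <li>first verb sense</li>
    <li>second verb sense</li>
  </ol>
</div>
"""

tree = html.fromstring(SAMPLE)

# Select the <h3> whose child <span> carries the wanted class and id.
h3 = tree.xpath("//h3[span[@class='mw-headline' and @id='Verb']]")[0]

# following::ol[1] is the first <ol> after the heading in document order;
# it does not have to be the heading's immediate sibling.
ol = h3.xpath("following::ol[1]")[0]

# Keep each top-level <li> as its own string instead of one text blob.
items = [li.text_content().strip() for li in ol.xpath("./li")]
print(items)
```

Here the `<p>` between the heading and the list is skipped, and the earlier "Noun" list is ignored because it precedes the matched heading.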


Python


import io

import requests
from lxml import html

# Download the page and parse it into an lxml tree.
res = requests.get("https://en.wiktionary.org/wiki/add#English")
tree = html.parse(io.StringIO(res.text))

outputs = []

# The <h3> whose child <span> has class "mw-headline" and id "Verb".
h3 = tree.xpath("//h3[span[@class = 'mw-headline' and @id = 'Verb']]")[0]
outputs.append(h3.xpath("span")[0].text)

# The first <ol> that follows that heading in document order.
ol = h3.xpath("following::ol[1]")[0]
outputs.append(ol.text_content())

print(outputs)

Output


['Verb',

 '(transitive) To join or unite (e.g. one thing to another, or as several particulars) so as to increase the number, augment the quantity, or enlarge the magnitude, or so as to form into one aggregate.\nTo sum up; to put together mentally.\n1689, John Locke, An Essay Concerning Human Understanding\n […] as easily as he can add together the ideas of two days or two years.\nto add numbers\n(transitive) To combine elements of (something) into one quantity.\nto add a column of numbers\n(transitive) To give by way of increased possession (to someone); to bestow (on).\n1611, King James Version, Genesis 30:24:\nThe LORD shall add to me another son.\n1667, John Milton, Paradise Lost:\nBack to thy punishment, False fugitive, and to thy speed add wings.\n(transitive) To append (e,g, a statement); to say further information.\n1855, Thomas Babington Macaulay, The History of England from the Accession of James the Second, volume 3, page 37\xa0[1]:\nHe added that he would willingly consent to the entire abolition of the tax\n1900, L. Frank Baum, The Wonderful Wizard of Oz Chapter 23\n"Bless your dear heart," she said, "I am sure I can tell you of a way to get back to Kansas." Then she added, "But, if I do, you must give me the Golden Cap."\n(intransitive) To make an addition; to augment; to increase.\n1611, King James Version, 1 Kings 12:14:\nI will add to your yoke\n2013 June 29,  “A punch in the gut”, in  The Economist, volume 407, number 8842, page 72-3:Mostly, the microbiome is beneficial. […] Research over the past few years, however, has implicated it in diseases from atherosclerosis to asthma to autism. Dr Yoshimoto and his colleagues would like to add liver cancer to that list.\nIt adds to our anxiety.\n(intransitive, mathematics) To perform the arithmetical operation of addition.\nHe adds rapidly.\n(intransitive, video games) To summon minions or reinforcements.\nTypically, a hostile mob will add whenever it\'s within the aggro radius of a player.']
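The output above is one long string for the whole list, while the question asked for numbered items and for BeautifulSoup. A rough sketch of the same idea in BeautifulSoup follows. The helper name `senses_after_heading` and the `SAMPLE` markup are hypothetical, and the sample only imitates the structure described in the question, so the live page may need different selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page markup.
SAMPLE = """
<div class="mw-parser-output">
  <h3><span class="mw-headline" id="Noun">Noun</span></h3>
  <ol><li>first noun sense</li></ol>
  <h3><span class="mw-headline" id="Verb">Verb</span></h3>
  <ol>
    <li><i>(transitive)</i> first verb sense</li>
    <li>second verb sense</li>
  </ol>
</div>
"""


def senses_after_heading(markup, heading_id):
    """Return the top-level <li> texts of the first <ol> after the given <h3>."""
    soup = BeautifulSoup(markup, "html.parser")
    # Locate the <span> by class and id, then climb to its enclosing <h3>.
    span = soup.find("span", class_="mw-headline", id=heading_id)
    h3 = span.find_parent("h3") if span is not None else None
    if h3 is None:
        return []
    # find_next returns the first matching tag after h3 in document order.
    ol = h3.find_next("ol")
    if ol is None:
        return []
    # recursive=False keeps only direct children, so nested lists are not
    # counted as separate senses; split/join normalizes whitespace.
    return [" ".join(li.get_text().split())
            for li in ol.find_all("li", recursive=False)]


senses = senses_after_heading(SAMPLE, "Verb")
print("Verb")
for number, sense in enumerate(senses, start=1):
    print(f"{number}. {sense}")
```

Each item's text still includes anything nested inside that `<li>`, so on a page where senses carry quotations or usage examples, those would need to be removed from each item first if only the definitions are wanted.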



Answered 2023-08-21