How do I select the second div in this code when it has no identifying attributes?

慕容森 2022-06-22 16:32:27
I can't work out how to use bs4 to get the second div inside the outer div. I need the div that holds the date. Thanks for your help. Here is the code:

<div class="featured-item-meta">
    <div><strong>Published:</strong></div>
    <div>October 14, 2015</div>
    <ul class="creatorList">
        <li>
            <div><strong>Writer:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/10329/g_willow_wilson">G. Willow Wilson</a>, <a href="https://www.marvel.com/comics/creators/12441/marguerite_bennett">Marguerite  Bennett</a></div>
        </li>
        <li>
            <div><strong>Cover Artist:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/8825/jorge_molina">Jorge  Molina</a></div>
        </li>
    </ul>
</div>

3 Answers

人到中年有点甜

1895 contribution points, over 7 likes

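With bs4 4.7.1+ this is easy. You can use :has and :contains to get the parent div that has a strong child containing the string Published:, then use the adjacent sibling combinator (+) to get the next div.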


from bs4 import BeautifulSoup as bs

html = '''
<div class="featured-item-meta">
    <div><strong>Published:</strong></div>
    <div>October 14, 2015</div>
    <ul class="creatorList">
        <li>
            <div><strong>Writer:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/10329/g_willow_wilson">G. Willow Wilson</a>, <a href="https://www.marvel.com/comics/creators/12441/marguerite_bennett">Marguerite  Bennett</a></div>
        </li>
        <li>
            <div><strong>Cover Artist:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/8825/jorge_molina">Jorge  Molina</a></div>
        </li>
    </ul>
</div>
'''

# The 'lxml' parser requires the lxml package to be installed
soup = bs(html, 'lxml')

# Match the div whose <strong> child contains "Published:", then take the div right after it.
# Soup Sieve 2.1+ deprecates :contains in favor of :-soup-contains.
print(soup.select_one('div:has(strong:contains("Published:")) + div').text)
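The same "find the label, then take the next div" idea can also be sketched without the :has and :contains selectors, which may help on older bs4 versions. The helper name value_after_label below is hypothetical, not part of bs4, and the sketch assumes each label sits in a strong tag whose parent div is immediately followed by the value div, as in the question's markup.

```python
from bs4 import BeautifulSoup

html = """
<div class="featured-item-meta">
    <div><strong>Published:</strong></div>
    <div>October 14, 2015</div>
    <ul class="creatorList">
        <li>
            <div><strong>Writer:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/10329/g_willow_wilson">G. Willow Wilson</a>, <a href="https://www.marvel.com/comics/creators/12441/marguerite_bennett">Marguerite  Bennett</a></div>
        </li>
    </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def value_after_label(soup, label):
    """Hypothetical helper: return the text of the div that follows the label's div."""
    # Find the <strong> whose only text is exactly the label.
    strong = soup.find("strong", string=label)
    if strong is None:
        return None
    # The label's parent is the first div; the value is the next sibling div.
    value_div = strong.parent.find_next_sibling("div")
    if value_div is None:
        return None
    # Collapse runs of whitespace so the result is a clean single line.
    return " ".join(value_div.get_text().split())


print(value_after_label(soup, "Published:"))
print(value_after_label(soup, "Writer:"))
```

Because the lookup is keyed on the label text rather than on position, the same call works for Writer: or Cover Artist: as long as the page keeps the label/value pairing.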


Answered 2022-06-22
慕侠2389804

1719 contribution points, over 6 likes

Here is a workaround:


from bs4 import BeautifulSoup

# A triple-quoted string replaces the backslash line continuations
text = '''<div class="featured-item-meta">
<div><strong>Published:</strong></div>
<div>October 14, 2015</div>
<ul class="creatorList">
    <li>
        <div><strong>Writer:</strong></div>
        <div><a href="https://www.marvel.com/comics/creators/10329/g_willow_wilson">G. Willow Wilson</a>, <a href="https://www.marvel.com/comics/creators/12441/marguerite_bennett">Marguerite  Bennett</a></div>
    </li>
    <li>
        <div><strong>Cover Artist:</strong></div>
        <div><a href="https://www.marvel.com/comics/creators/8825/jorge_molina">Jorge  Molina</a></div>
    </li>
</ul>
</div>'''

soup = BeautifulSoup(text, 'html.parser')

# find_all() searches recursively, so index 1 is the second div in document order
print(soup.find('div', attrs={'class': 'featured-item-meta'})
          .find_all('div')[1].text)

Output:


October 14, 2015
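One caveat with indexing into find_all('div') is that the search is recursive by default, so a nested div shifts the index. The sketch below uses a hypothetical variant of the markup, with an extra wrapper div that is not in the question, to show how recursive=False keeps the index tied to direct children only.

```python
from bs4 import BeautifulSoup

# Hypothetical variant of the markup: the label div wraps an extra nested div.
html = """
<div class="featured-item-meta">
    <div><div><strong>Published:</strong></div></div>
    <div>October 14, 2015</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
meta = soup.find("div", class_="featured-item-meta")

# Default recursive search: index 1 is the nested div inside the label div.
recursive_hit = meta.find_all("div")[1]

# Direct children only: index 1 is the date div again.
children = meta.find_all("div", recursive=False)
date_text = children[1].get_text(strip=True) if len(children) > 1 else None

print(recursive_hit.get_text(strip=True))
print(date_text)
```

The length check is there only so the sketch returns None instead of raising IndexError when the page has fewer divs than expected.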


Answered 2022-06-22
慕码人2483693

1860 contribution points, over 9 likes

from bs4 import BeautifulSoup as bsp

s = '''
<div class="featured-item-meta">
    <div><strong>Published:</strong></div>
    <div>October 14, 2015</div>
    <ul class="creatorList">
        <li>
            <div><strong>Writer:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/10329/g_willow_wilson">G. Willow Wilson</a>, <a href="https://www.marvel.com/comics/creators/12441/marguerite_bennett">Marguerite  Bennett</a></div>
        </li>
        <li>
            <div><strong>Cover Artist:</strong></div>
            <div><a href="https://www.marvel.com/comics/creators/8825/jorge_molina">Jorge  Molina</a></div>
        </li>
    </ul>
</div>
'''

# Name the parser explicitly to avoid bs4's "no parser specified" warning.
# find_all() replaces the legacy findChildren() alias; this prints the whole tag, not just its text.
print(bsp(s, 'html.parser').find('div').find_all('div')[1])

The code may need slight changes depending on your full page and its structure.
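If the surrounding page contains other divs, one way to make the lookup less sensitive to them is to scope it to the container's class with a CSS selector. This is a minimal sketch, assuming the container keeps the featured-item-meta class from the question and the date stays the second div among its direct children.

```python
from bs4 import BeautifulSoup

html = """
<div class="page-wrapper">
    <div class="featured-item-meta">
        <div><strong>Published:</strong></div>
        <div>October 14, 2015</div>
        <ul class="creatorList">
            <li>
                <div><strong>Cover Artist:</strong></div>
                <div><a href="https://www.marvel.com/comics/creators/8825/jorge_molina">Jorge  Molina</a></div>
            </li>
        </ul>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# ">" restricts the match to direct children of the container;
# :nth-of-type(2) picks the second div among those siblings.
date_div = soup.select_one("div.featured-item-meta > div:nth-of-type(2)")
date_text = date_div.get_text(strip=True) if date_div else None

print(date_text)
```

The page-wrapper div is a hypothetical stand-in for the rest of a real page; the selector ignores it because it starts from the class name rather than from the first div in the document.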


Answered 2022-06-22