Scraping the text between elements in Scrapy

Python

米脂 2023-10-06 19:18:48

I am using Scrapy and trying to scrape content like this:

<html>
  <div class='hello'> some elements . . . </div>
  <div class='hi there'>
    <div>
      <h3> title </h3>
      <h4> another title </h4>
      <p> some text ..... </p>
      "some text without any tag"
      <div class='article'> some elements . . </div>
      <div class='article'> some elements . . </div>
      <div class='article'> some elements . . </div>
    </div>
  </div>
</html>

If I want to extract the text from everything that sits under the div with class "hi there" and before the divs with class "article", is there a way to do it using either XPath or CSS selectors?


1 Answer

倚天杖


I have not used Scrapy, so I do not know what it supports, but

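//div[@class='hi there']/div/div[@class='article'][1]/preceding-sibling::*

selects the elements that come before the first div with class "article", and

//div[@class='hi there']/div/div[@class='article'][1]/preceding-sibling::text()

gives you the bare text nodes before that article div, that is, the text that is not wrapped in any tag.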

2023-10-06
