Problem with looping to scrape the following pages

2018-08-13 11:37:59 [scrapy.core.scraper] ERROR: Spider error processing <GET https://movie.douban.com/top250> (referer: None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/usr/local/lib/python3.7/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
    for x in result:
  File "/usr/local/lib/python3.7/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/local/lib/python3.7/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.7/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/douban/douban/spiders/douban_spider.py", line 36, in parse
    next_link = response.xpath("//span[@class='next']/link/@href").extarct()
AttributeError: 'SelectorList' object has no attribute 'extarct'


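Teacher Da Zhuang, I followed your tutorial and tested it myself. When the spider tries to get the URL of the next page, it fails to retrieve it, so I only end up with the first 25 records.
Could you please give me some guidance?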

1 answer

Please post your code.
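
Until the full spider is posted, the last frame of the traceback already points at a likely cause: `extarct` is a misspelling of `extract`, the method that `SelectorList` actually provides. The exception is raised in `parse` at the point where the next-page link is read, so no request for a following page is ever issued, which would explain why only the first 25 records arrive. Below is a minimal sketch of the next-page step with the spelling corrected. The helper name `next_page_url` is hypothetical, and the sample href `?start=25&filter=` is an assumption about what the page returns, not something taken from the post.

```python
from urllib.parse import urljoin


def next_page_url(page_url, hrefs):
    """Return the absolute URL of the next page, or None on the last page.

    `hrefs` stands for the list of strings returned by
    response.xpath("//span[@class='next']/link/@href").extract()
    (note the spelling: extract, not extarct).
    """
    if not hrefs:
        # No "next" link was found, so this is treated as the last page.
        return None
    # The href may be relative (assumed here), so resolve it against the page URL.
    return urljoin(page_url, hrefs[0])


# Inside the spider's parse() method (Scrapy code, not executed here),
# the helper could be used roughly like this:
#
#     hrefs = response.xpath("//span[@class='next']/link/@href").extract()
#     url = next_page_url(response.url, hrefs)
#     if url:
#         yield scrapy.Request(url, callback=self.parse)

if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(next_page_url("https://movie.douban.com/top250", ["?start=25&filter="]))
    print(next_page_url("https://movie.douban.com/top250", []))
```

If the spelling fix alone does not bring in the later pages, the rest of `parse` (in particular how the next request is yielded) would still need to be checked against the posted code.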
