The program only prints the root URL and then "craw failed".
I have compared my code with the instructor's and it is identical.

                            2018-11-18
I ran into the same error as you. After removing the try block, the traceback shows that the get_text() call in html_parser fails:
Traceback (most recent call last):
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 41, in <module>
    obj_spider.craw(root_url)      # start the crawler
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\spider_main.py", line 23, in craw
    new_urls, new_data =self.parser.parse(new_url,html_cont)    
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 40, in parse
    new_data = self._get_new_data(page_url,soup)
  File "G:\eclipse-workspace(JAVAEE)\Python01\baike_spider\html_parser.py", line 27, in _get_new_data
    res_data['title'] =title_node.get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
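
The last line of the traceback means that title_node is None when get_text() is called on it. If soup is a BeautifulSoup object, that is what find() returns when nothing on the page matches, so the lookup in _get_new_data probably no longer matches the HTML that was downloaded. Below is a minimal sketch of checking for None before reading the text. The names get_title_text and FakeNode are made up for illustration and are not part of the course code; FakeNode only stands in for a parsed node so the snippet runs without any third-party library.

```python
def get_title_text(title_node, page_url):
    """Return the node's text, or None when the lookup found nothing."""
    if title_node is None:
        # A find()-style lookup returned no match; report it instead of crashing.
        print("no title node found on %s" % page_url)
        return None
    return title_node.get_text()


class FakeNode:
    """Hypothetical stand-in for a parsed HTML node, used only for this demo."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text


# A node that was found yields its text; a missing node yields None.
print(get_title_text(FakeNode("Python"), "http://example.com/a"))
print(get_title_text(None, "http://example.com/b"))
```

With a guard like this the crawler can skip or log the page instead of raising, which makes it easier to see which URL has a layout the lookup does not match. Printing the downloaded HTML for that URL is a reasonable next step for checking what the page actually contains.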