For the same page, nearly identical code parses and runs correctly under Python 3 on Windows 8. After porting the code to Ubuntu with Python 2.7, however, BeautifulSoup can no longer parse the fetched page: `find_all('table')` returns an empty result. Here is part of the problematic code (it runs as posted):

```python
#coding:utf-8
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

import urllib2
from bs4 import BeautifulSoup

# Form fields for the search request; the D1 and D3 values are GBK percent-encoded.
postdata = "T1=&T2=1&T3=&T4=&T5=&APPDate=&T7=&T8=&T9=&PRDate=&T11=&SQDate=&JDDate=&T14=&T15=&T16=&T17=&SDDate=&T19=&T20=&T21=&D1=%B8%B4%C9%F3&D2=jdr&D3=%C9%FD%D0%F2&C1=fm&C2=&C3=&page=70"
postdata = postdata.encode('utf-8')

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6',
    'Referer': 'http://app.sipo-reexam.gov.cn/reexam_out/searchdoc/searchfs.jsp'
}

# POST the form and read the response body.
req = urllib2.Request(
    url="http://app.sipo-reexam.gov.cn/reexam_out/searchdoc/searchfs.jsp",
    headers=headers,
    data=postdata)
fp = urllib2.urlopen(req)

# Re-encode the GBK response as UTF-8 before handing it to BeautifulSoup.
mybytes = fp.read().decode('gbk').encode('utf-8')
soup = BeautifulSoup(mybytes, from_coding="uft-8")

print soup.original_encoding
print soup.prettify()
```

Any pointers would be appreciated.
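One way to narrow the problem down is to check whether the `<table>` tags are present in the response at all, independently of BeautifulSoup. The following is only a diagnostic sketch using the standard-library HTML parser. The names `TableCounter`, `count_tables`, and `sample` are hypothetical and not part of the original code, and `sample` is a made-up stand-in for `fp.read()`, since the real page content is not shown here.

```python
# -*- coding: utf-8 -*-
# Diagnostic sketch: count <table> start tags with the standard-library parser,
# independently of BeautifulSoup, to see whether the tables reach the client.
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2.7


class TableCounter(HTMLParser):
    """Counts <table> start tags seen while parsing (hypothetical helper)."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "table":
            self.tables += 1


def count_tables(raw_bytes, encoding="gbk"):
    """Decode raw response bytes and return the number of <table> start tags."""
    text = raw_bytes.decode(encoding, "replace")
    counter = TableCounter()
    counter.feed(text)
    counter.close()
    return counter.tables


# Made-up stand-in for fp.read(); an assumption for illustration only.
sample = u"<html><body><table><tr><td>复审</td></tr></table></body></html>".encode("gbk")
print(count_tables(sample))
```

If this count is zero for the real response, the server is probably returning a different page on the second machine (for example because of the request encoding or headers). If the count is nonzero while `find_all('table')` is still empty, the difference more likely lies in how BeautifulSoup parses the markup in that environment, for instance which underlying parser is installed.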
