Retrieving URLs to scrape from a text file with BeautifulSoup

Python

HUWWW 2023-02-15 17:16:56

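I have the following script, and I want to retrieve the URLs from a text file instead of from a hard-coded array. I'm new to Python and have been stuck on this!

from bs4 import BeautifulSoup
import requests

urls = ['URL1', 'URL2', 'URL3']

for u in urls:
    response = requests.get(u)
    data = response.text
    soup = BeautifulSoup(data, 'lxml')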

1 Answer

富国沪深

Could you be clearer about what you want?

Here is one possible answer, which may or may not be what you are looking for:

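from bs4 import BeautifulSoup
import requests

# Open the file read-only; the with block closes it automatically.
with open('yourfilename.txt', 'r') as url_file:
    for line in url_file:
        # Remove surrounding whitespace, including the trailing newline.
        u = line.strip()
        response = requests.get(u)
        data = response.text
        soup = BeautifulSoup(data, 'lxml')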

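The file is opened with the open() function; the second argument, 'r', specifies that it is opened in read-only mode. The call to open() is wrapped in a with block, so the file is closed automatically once you no longer need it. The strip() method removes leading and trailing whitespace (spaces, tabs, newlines) from each line; for example, ' https://stackoverflow.com '.strip() becomes 'https://stackoverflow.com'.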
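To make the file-reading step reusable, the open(), with, and strip() behavior described above can be wrapped in a small helper. This is only a sketch: the function name read_urls is hypothetical, and skipping blank lines and lines that start with '#' is an assumption about how the text file might be laid out, not something the question specifies.

```python
def read_urls(path):
    """Return the URLs listed in a text file, one per line."""
    urls = []
    # The with block closes the file as soon as the loop is done.
    with open(path, "r", encoding="utf-8") as url_file:
        for line in url_file:
            # strip() drops leading/trailing spaces, tabs, and the newline.
            u = line.strip()
            # Assumption: blank lines and '#' comment lines are not URLs.
            if not u or u.startswith("#"):
                continue
            urls.append(u)
    return urls
```

The returned list can stand in for the original urls array, so the existing loop (for u in urls: ...) would work unchanged after urls = read_urls('yourfilename.txt'). Skipping empty lines also avoids calling requests.get('') when the file ends with a blank line.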

2023-02-15