How to fetch a web page that includes an external API in PHP

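I have a PHP script that loads this web page in order to extract some data from its table. None of the following approaches retrieves the table contents.

Using file_get_contents:

    $document = file_get_contents("http://www.webpage.com/");
    print_r($document);

Using cURL:

    $document = curl_init('http://www.webpage.com/');
    curl_setopt($document, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($document);
    print_r($html);

Using loadHTMLFile:

    $document = new DOMDocument();
    $document->loadHTMLFile('http://www.webpage.com/');
    print_r($document);

What am I doing wrong? And how is the site preventing some of its content from loading?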


Answers

叮当猫咪

Contributed 1776 experience points, received over 12 upvotes

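This is probably not the answer you want to hear, but none of the methods you describe evaluates JavaScript or loads other browser resources the way a normal browser client does. Each one simply retrieves the contents of the file you specify. A quick look at the site you are targeting makes it clear that the table is populated as the result of an AJAX call, and none of the methods you have tried can execute that call.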
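To make the point above concrete, here is a minimal sketch of what a plain download sees compared with what a browser ends up showing. The helper name `extract_table_rows` and both HTML strings are hypothetical stand-ins, not taken from the target site; the sketch only assumes the table is an empty shell until the page's script fills it.

```php
<?php
// Hypothetical helper: return the cell text of every table row in $html.
function extract_table_rows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table//tr') as $tr) {
        $cells = [];
        foreach ($xpath->query('./td | ./th', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Assumed shape of the raw download: the table exists but has no rows yet.
$static_html = '<html><body><table id="data"><tbody></tbody></table>'
    . '<script src="app.js"></script></body></html>';

// Assumed shape of the DOM after the browser has run the script.
$rendered_html = '<html><body><table id="data"><tbody>'
    . '<tr><td>Alice</td><td>42</td></tr></tbody></table></body></html>';

$static_rows = extract_table_rows($static_html);     // no rows to find
$rendered_rows = extract_table_rows($rendered_html); // one row with two cells
```

file_get_contents, cURL, and loadHTMLFile all hand back something like the first string, which is why parsing it finds nothing.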

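You will need to rely on a library or script that can emulate a browser in this way, for example the PHP bindings for Selenium WebDriver, laravel/dusk, or something similar.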

2023-06-24

天涯尽头无女友

Contributed 1831 experience points, received over 9 upvotes

This is how I scrape data from a web page using PHP cURL:

// Basic cURL helper: fetch a URL and return the response body as a string
function curl($url) {
    $ch = curl_init(); // Initialise cURL
    curl_setopt($ch, CURLOPT_URL, $url); // Set the URL to request
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return the response instead of printing it
    $data = curl_exec($ch); // Execute the request; returns false on failure
    curl_close($ch); // Close the handle
    return $data === false ? '' : $data; // Return an empty string if the request failed
}

// Basic scraping helper: return the text between $start and $end (case-insensitive)
function scrape_between($data, $start, $end) {
    $data = stristr($data, $start); // Drop everything before $start
    if ($data === false) {
        return ''; // $start marker not found
    }
    $data = substr($data, strlen($start)); // Drop the $start marker itself
    $stop = stripos($data, $end); // Position of the $end marker
    if ($stop === false) {
        return ''; // $end marker not found
    }
    return substr($data, 0, $stop); // Keep only the text before $end
}

$target_url = "https://www.somesite.com";
$scraped_website = curl($target_url);
$data_set_1 = scrape_between($scraped_website, "%before%", "%after%");
$data_set_2 = scrape_between($scraped_website, "%before%", "%after%");

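%before% and %after% stand for text that always appears on the page immediately before and after the data you want to scrape. This could be a div tag or some other HTML tag that is unique to the data you want to extract.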
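As an illustration of choosing such markers, the sketch below applies the same start/end idea to a made-up page fragment. The helper name `text_between`, the sample markup, and the `id` values are all hypothetical; the helper is a condensed restatement of the technique so the snippet runs on its own, not the answer's exact function.

```php
<?php
// Hypothetical condensed helper: text between $start and $end, case-insensitive.
function text_between(string $data, string $start, string $end): string
{
    $from = stripos($data, $start);
    if ($from === false) {
        return ''; // start marker not present
    }
    $from += strlen($start); // skip past the start marker
    $to = stripos($data, $end, $from);
    if ($to === false) {
        return ''; // end marker not present after the start marker
    }
    return substr($data, $from, $to - $from);
}

// Made-up fragment standing in for a downloaded page.
$sample_html = '<div class="header">Shop</div>'
    . '<span id="price">42.50</span>'
    . '<span id="stock">In stock</span>';

// The id attribute makes each start marker unique on the page.
$price = text_between($sample_html, '<span id="price">', '</span>');
$stock = text_between($sample_html, '<span id="stock">', '</span>');
```

A start marker that occurs more than once would always match its first occurrence, so it helps to include a distinguishing attribute in the marker. Note that this approach still only works on data that is present in the downloaded HTML.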

2023-06-24

有只小跳蛙

Contributed 1824 experience points, received over 8 upvotes

In that case, perhaps consider using cURL to mimic the same AJAX request the website itself makes? This is what I found when I searched: Mimicing an ajax call with Curl PHP
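A rough sketch of that idea follows. It assumes you have already found the request in the browser's network tab and that the endpoint returns a JSON array of row objects; the function names, the headers chosen, and the sample payload are all assumptions, and a real site may require different headers, cookies, or a POST body.

```php
<?php
// Hypothetical helper: call the endpoint the page's JavaScript calls.
function fetch_ajax_body(string $endpoint, string $referer): string
{
    $ch = curl_init($endpoint);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 15,   // give up after 15 seconds
        CURLOPT_REFERER        => $referer,
        CURLOPT_HTTPHEADER     => [
            'Accept: application/json',
            // Many sites check this header to recognise AJAX requests.
            'X-Requested-With: XMLHttpRequest',
        ],
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: decode the response, assuming a JSON array of rows.
function decode_rows(string $json): array
{
    $decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($decoded)) {
        throw new UnexpectedValueException('Expected a JSON array or object');
    }
    return $decoded;
}

// Made-up payload standing in for what such an endpoint might return.
$sample_json = '[{"name":"Alice","score":42},{"name":"Bob","score":17}]';
$rows = decode_rows($sample_json);
```

With a real endpoint, the two helpers would be chained as `decode_rows(fetch_ajax_body($endpoint, $page_url))`, where both variables hold values you copied from the browser's developer tools.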

2023-06-24
