Question

Below is my code to extract links from a given URL. My issue is that when I view the source of that URL, there is a link with the domain https://fs1.pdisk.pro:183, but when I extract the links, it does not appear.

<?php
function extractLinks($url) {

  // Get the HTML content of the page.
  $html = file_get_contents($url);

  // Create a DOMDocument object.
  $dom = new DOMDocument();
  @$dom->loadHTML($html);

  // Get all the anchor elements.
  $anchors = $dom->getElementsByTagName('source');

  // Create an array to store the links.
  $links = array();

  // Loop through the anchor elements.
  foreach ($anchors as $anchor) 
  {
    // Get the href attribute of the anchor element.
    $href = $anchor->getAttribute('src');

    // Add the link to the links array.
    $links[] = $href;
  }

  // Return the links array as JSON.
  return json_encode($links);
}

// Get the URL of the website to extract links from.
$url = 'http://pdisk.investro1.com/how-to-buy-life-insurance-online-qfevac8cq8x4.html';

// Extract the links from the website.
$links = extractLinks($url);

// Print the links in JSON format.
echo $links;

Can someone help me extract all the links for the needed domain from the given URL and, if possible, redirect to the extracted link and return the response in JSON format, like url=link?

Answer 1

Score: 0

You are asking for code to scrape a website.
It is illegal to take certain content without the source owner's consent.

That said, the link with port :183 is not under an <a> tag. It is under a <video> --> <source> tag.

Please change the line
$anchors = $dom->getElementsByTagName('a');
to
$anchors = $dom->getElementsByTagName('source');

Also change the line
$href = $anchor->getAttribute('href');
to
$href = $anchor->getAttribute('src');
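
Put together, the change could look like the sketch below. This is only an illustration, not the asker's final code: the helper name extractSourceLinks, its optional $host filter, and the sample HTML string are all assumptions made up for the example. It reads the src of every <source> element, keeps only URLs on the requested host, and encodes the result as JSON once, at the point of output.

```php
<?php
// Hypothetical helper (not from the original post): collect the src attribute
// of every <source> element in an HTML string, optionally keeping only URLs
// whose host matches $host.
function extractSourceLinks(string $html, ?string $host = null): array
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('source') as $source) {
        $src = $source->getAttribute('src');
        if ($src === '') {
            continue; // skip <source> elements without a src
        }
        if ($host !== null && parse_url($src, PHP_URL_HOST) !== $host) {
            continue; // skip URLs on other hosts
        }
        $links[] = $src;
    }

    return $links;
}

// Made-up sample markup standing in for the fetched page.
$html = '<html><body><a href="https://example.com/page">page</a>'
      . '<video><source src="https://fs1.pdisk.pro:183/sample.mp4" type="video/mp4"></video>'
      . '</body></html>';

$links = extractSourceLinks($html, 'fs1.pdisk.pro');

// Encode once, when printing, in the url=link shape the question asks for.
echo json_encode(['url' => $links[0] ?? null], JSON_UNESCAPED_SLASHES), PHP_EOL;
```

For a real page, the HTML string would come from file_get_contents($url) as in the question. The function returns a plain array so that the caller decides when to call json_encode, which avoids encoding the result twice.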

Beware:
Web scraping requires the owner's permission to extract data from the source website.
