Question

I'm trying to get the results of a search in the Rust documentation. I wrote the following code to do it:

const HTMLParser = require('node-html-parser');
const https = require('https');

const search = "foo";
const options = {
    host: "doc.rust-lang.org",
    path: "/std/index.html?search=" + encodeURIComponent(search)
};

// https.get() sends the request automatically, so no explicit request.end() is needed.
https.get(options, (res) => {
    if (res.statusCode !== 200) {
        res.resume(); // discard the body so the connection can be released
        return console.log(`An error occurred: ${res.statusCode}. Retry later.`);
    }
    res.setEncoding("utf8");

    let output = "";

    // Accumulate the response body chunk by chunk.
    res.on("data", (chunk) => {
        output += chunk;
    });

    res.on("end", () => {
        const root = HTMLParser.parse(output);
        // Prints "null": the search has not run when the response arrives,
        // so the static HTML contains no search results.
        console.log(root.querySelector(".search-results"));
    });
}).on("error", (err) => console.error(err));

But when I run this code, I get the HTML content of the index.html page, as if I had requested it without ?search=foo. I found that the page changes dynamically via JavaScript when you search for something: the base content is hidden and the search div becomes visible. So it seems that the JS has not run by the time I receive the response, but I need it to get the search results from the documentation. I don't know how to do that.
Thank you in advance for your answers!

Answer 1

Score: 1

The Rust documentation page does not appear to contact a backend when a search is performed. I checked this with the browser developer tools.

It looks like the page loads a search index that contains the documentation entries, and the search runs against it entirely in the browser. You can load that JavaScript yourself and search the index directly. The search logic is written in main.js.
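
As a rough illustration of searching such an index locally, here is a minimal sketch. The `sampleIndex` layout and the `searchItems` helper are assumptions made up for this example; the real search-index.js uses its own format, and main.js implements a more elaborate ranking.

```javascript
// Hypothetical, simplified index entries (not the real search-index.js layout).
const sampleIndex = [
  { name: 'HashMap', path: 'std::collections', type: 'struct' },
  { name: 'hash', path: 'std', type: 'mod' },
  { name: 'Vec', path: 'std::vec', type: 'struct' },
];

// Return the entries whose name contains the query (case-insensitive),
// listing exact name matches first.
function searchItems(index, query) {
  const q = query.toLowerCase();
  return index
    .filter((item) => item.name.toLowerCase().includes(q))
    .sort((a, b) => {
      const aExact = a.name.toLowerCase() === q ? 0 : 1;
      const bExact = b.name.toLowerCase() === q ? 0 : 1;
      return aExact - bExact;
    });
}

const hits = searchItems(sampleIndex, 'hash');
console.log(hits.map((item) => `${item.path}::${item.name}`));
// prints [ 'std::hash', 'std::collections::HashMap' ]
```

Once the index is available as plain data, no browser or DOM is needed to query it.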

Let me know if you need more information; I have not yet worked out how the link for each documentation item is generated.

EDIT

All the logic required to build the URL is in main.js; the relevant function is shown below. If you take a close look at the aliases.js, main.js, storage.js and search-index.js files, you can reuse almost all of it to build the links and produce the search output you need.

// Returns [displayPath, href] for a search-index item.
// Relies on `itemTypes` and `rootPath`, which are defined outside this function.
function buildHrefAndPath(item) {
  var displayPath;
  var href;
  var type = itemTypes[item.ty];
  var name = item.name;
  if (type === 'mod') {
    // Modules have their own directory with an index.html.
    displayPath = item.path + '::';
    href = rootPath + item.path.replace(/::/g, '/') + '/' + name + '/index.html';
  } else if (type === 'primitive' || type === 'keyword') {
    displayPath = '';
    href = rootPath + item.path.replace(/::/g, '/') + '/' + type + '.' + name + '.html';
  } else if (type === 'externcrate') {
    displayPath = '';
    href = rootPath + name + '/index.html';
  } else if (item.parent !== undefined) {
    // Items with a parent (e.g. methods) are anchors on the parent's page.
    var myparent = item.parent;
    var anchor = '#' + type + '.' + name;
    var parentType = itemTypes[myparent.ty];
    if (parentType === 'primitive') {
      displayPath = myparent.name + '::';
    } else {
      displayPath = item.path + '::' + myparent.name + '::';
    }
    href = rootPath + item.path.replace(/::/g, '/') + '/' + parentType + '.' + myparent.name + '.html' + anchor;
  } else {
    // Everything else gets a page named "<type>.<name>.html".
    displayPath = item.path + '::';
    href = rootPath + item.path.replace(/::/g, '/') + '/' + type + '.' + name + '.html';
  }
  return [displayPath, href];
}
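
To see what the function above produces, here is a small self-contained sketch. The `itemTypes` array, the `rootPath` value, and the sample items are assumptions chosen for illustration (the real values come from the documentation's own scripts), and the function is a condensed copy with the same branching.

```javascript
// Assumed stand-ins for the globals the function reads (hypothetical values).
const itemTypes = ['mod', 'struct', 'method']; // made-up subset, not the real table
const rootPath = 'https://doc.rust-lang.org/';

// Condensed copy of buildHrefAndPath with the same branches.
function buildHrefAndPath(item) {
  const type = itemTypes[item.ty];
  const name = item.name;
  // Directory for the item's path, e.g. "std::collections" -> ".../std/collections/".
  const dir = () => rootPath + item.path.replace(/::/g, '/') + '/';
  if (type === 'mod') return [item.path + '::', dir() + name + '/index.html'];
  if (type === 'primitive' || type === 'keyword') return ['', dir() + type + '.' + name + '.html'];
  if (type === 'externcrate') return ['', rootPath + name + '/index.html'];
  if (item.parent !== undefined) {
    const parentType = itemTypes[item.parent.ty];
    const displayPath = parentType === 'primitive'
      ? item.parent.name + '::'
      : item.path + '::' + item.parent.name + '::';
    const anchor = '#' + type + '.' + name;
    return [displayPath, dir() + parentType + '.' + item.parent.name + '.html' + anchor];
  }
  return [item.path + '::', dir() + type + '.' + name + '.html'];
}

// Hypothetical sample items in the shape the function expects.
const moduleItem = { ty: 0, name: 'collections', path: 'std' };
const structItem = { ty: 1, name: 'HashMap', path: 'std::collections' };
const methodItem = { ty: 2, name: 'insert', path: 'std::collections', parent: { ty: 1, name: 'HashMap' } };

console.log(buildHrefAndPath(moduleItem)[1]);
// prints https://doc.rust-lang.org/std/collections/index.html
console.log(buildHrefAndPath(structItem)[1]);
// prints https://doc.rust-lang.org/std/collections/struct.HashMap.html
console.log(buildHrefAndPath(methodItem));
// prints [ 'std::collections::HashMap::', 'https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.insert' ]
```

The method case shows why the parent matters: the link points at the parent's page, with the item itself as an anchor.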
