How to get image details within an anchor element <a><img src></a>

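Question

I am using Node.js and jsdom, and I am trying to retrieve the images used within anchor tags.

I can enumerate the images with .querySelectorAll("img"); and the anchors with .querySelectorAll("a");.

But I can't find the relationship between the two, which is the part I'm after: knowing that a displayed image, when clicked, navigates to x.

Sample HTML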

<a href="http://www.yahoo.com">
  <img src="https://s.yimg.com/nq/nr/img/yahoo_mail_global_english_white_1x.png" alt="Yahoo Mail Image">
</a>

Node.js

var links = dom.window.document.querySelectorAll("a");
links.forEach(function (value) {
    console.log('link to x is displayed with image.alt "yahoo mail image" and image.src "https://...."');
    console.log('Host: ' + value.hostname);
    console.log('Href: ' + value.href);
    console.log('Text: ' + value.text);
    console.log('HTML: ');
    console.dir(value);
});

Expected result:

link to x is displayed with image.alt "yahoo mail image" and image.src "https://...."

Answer 1

Score: 1

Without seeing your HTML context, I can suggest running queries within the link subtrees:

```js
const {JSDOM} = require("jsdom"); // ^22.0.0

const html = `
<a href="http://www.yahoo.com">
  <img src="https://s.yimg.com/nq/nr/img/yahoo_mail_global_english_white_1x.png" alt="Yahoo Mail Image">
</a>
<a href="http://www.google.com">
  <img src="google.png" alt="Google Image">
</a>
<a href="http://www.example.com">
  <img src="whatever.png" alt="Whatever Image">
</a>`;

const {window: {document}} = new JSDOM(html);
const data = [...document.querySelectorAll("a")].map(e => ({
  src: e.querySelector("img").src,
  alt: e.querySelector("img").getAttribute("alt"),
  href: e.href,
}));
console.log(data);
```

Output:

```
[
  {
    src: 'https://s.yimg.com/nq/nr/img/yahoo_mail_global_english_white_1x.png',
    alt: 'Yahoo Mail Image',
    href: 'http://www.yahoo.com/'
  },
  {
    src: 'google.png',
    alt: 'Google Image',
    href: 'http://www.google.com/'
  },
  {
    src: 'whatever.png',
    alt: 'Whatever Image',
    href: 'http://www.example.com/'
  }
]
```

However, the page you're working with likely contains other links, so I would add a parent container to refine your `a` selector. As written it is probably too broad and will match links that don't have `<img>` tags as children, in which case `e.querySelector("img")` returns `null` and reading `.src` from it throws a `TypeError`.

Using the Sizzle pseudo-selector `a:has(img)`, XPath, or a filter (shown below) might also help:

```js
const data = [...document.querySelectorAll("a")]
  .filter(e => e.querySelector(":scope > img"))
  .map(e => ({
    src: e.querySelector("img").src,
    alt: e.querySelector("img").getAttribute("alt"),
    href: e.href,
  }));
```

...but this is speculation.


huangapple
  • Posted on May 24, 2023, 19:50:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76323205.html