How to get the backgroundImage URL in cheerio

Question

I was trying to get the banner of my Anime-Planet account for my scraper.
I tried everything I know with cheerio, but I couldn't get the background-image URL of the profileBackground element.
Properties

I tried:

    async function Al() {
      const cheerio = require("cheerio");
      const url = "https://www.anime-planet.com/users/kyoyacchi";
      const {data} = await client.axios.get(url, {
        headers: {
          "User-Agent":
            "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Mobile Safari/537.36",
        },
      });
      const $ = cheerio.load(data);
      return $(".pull-beta.pull-alpha.pullup.editableBanner")
        .find(".wrapper")
        .find("profileBackground");
    }
    Al();

Here is the result

This only returns the avatar path.

Answer 1

Score: 0

I learned that the selector is $('div[id=profileBackground]').

Answer 2

Score: 0

You can retrieve the image path with:

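    const axios = require("axios");
    const cheerio = require("cheerio");

    const url = "https://www.anime-planet.com/users/kyoyacchi";

    axios.get(url)
      .then(({data: html}) => {
        const $ = cheerio.load(html);
        // Read the inline style of the element with id="profileBackground",
        // then capture whatever sits inside url(...).
        const path = $("#profileBackground")
          .attr("style")
          .match(/background-image: *url *\((.+?)\)/)[1];
        console.log(path); // => /images/users/backgrounds/3966485.jpg?t=1660418964
      });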

Use #some-id to select by the id attribute. In CSS, a bare word refers to a tag name, so p is the selector for <p></p>. Ids are supposed to be unique in the document, so you don't usually need to specify parent selectors.

The regex above extracts the contents from the parentheses after background-image: url.
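
If the style attribute might be missing, or the URL might be wrapped in quotes, a slightly more defensive variant can help. The sketch below is not part of the original answer: extractBackgroundUrl and toAbsolute are hypothetical helper names, and the base URL is assumed from the profile URL in the question.

```javascript
// Assumed base, taken from the profile URL used in the question.
const BASE = "https://www.anime-planet.com";

// Hypothetical helper: pull the URL out of an inline style string.
// Returns null instead of throwing when there is no style or no match.
function extractBackgroundUrl(style) {
  if (!style) return null; // .attr("style") yields undefined if the attribute is absent
  // Group 1 captures an optional quote, group 2 the URL, \1 the matching closing quote.
  const match = style.match(/background-image:\s*url\(\s*(['"]?)(.*?)\1\s*\)/i);
  return match ? match[2] : null;
}

// Hypothetical helper: resolve a site-relative path against the base URL.
function toAbsolute(path, base = BASE) {
  return new URL(path, base).href;
}

const style =
  "background-image: url(/images/users/backgrounds/3966485.jpg?t=1660418964)";
const path = extractBackgroundUrl(style);
console.log(path);             // /images/users/backgrounds/3966485.jpg?t=1660418964
console.log(toAbsolute(path)); // https://www.anime-planet.com/images/users/backgrounds/3966485.jpg?t=1660418964
```

With cheerio, the helper would be called as extractBackgroundUrl($("#profileBackground").attr("style")), which avoids a TypeError when the element or its style is not present in the fetched HTML.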

huangapple
  • Posted on January 5, 2023, 19:03:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75017447.html