How to get a readable value from CDPElementHandle

Question

I am trying to scrape some data from a website. My code looks like this:

const puppeteer = require("puppeteer") 
const main = async () => {
const browser = await puppeteer.launch({})
const page = await browser.newPage()
await page.goto("https://www.example.com")
await page.waitForSelector(".example") 
const titleNode = await page.$$(".example")
titleNode.forEach(  el => {
  el.getProperties("textContent").then(el => {
          console.log(el)
  })
})
 console.log( titleNode );
 browser.close()
}
main()

And the result is something like this:

[
    CDPElementHandle { handle: CDPJSHandle {} },
    CDPElementHandle { handle: CDPJSHandle {} },
    CDPElementHandle { handle: CDPJSHandle {} },
    CDPElementHandle { handle: CDPJSHandle {} },
    CDPElementHandle { handle: CDPJSHandle {} },
]

I want to get the actual text content inside the elements with class 'example'. How do I extract that value? I tried .getProperties with .jsonValue, but it didn't work.
Any help would be appreciated.

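Answer 1

Score: 4

Array.prototype.forEach is not designed for asynchronous code: it does not wait for the promises its callback returns. Instead of .forEach, use for...of, or map together with Promise.all, and read the text with evaluate on each handle: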

const puppeteer = require("puppeteer");

const html = `
    <div>
        <a>text1</a>
        <a class='example'>text2</a>
        <a>text3</a>
        <a class='example'>text4</a>
        <a>text5</a>
        <a>text6</a>
    </div>
`;

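const main = async () => {
    const browser = await puppeteer.launch({});
    const page = await browser.newPage();
    await page.setContent(html);

    // page.$$ resolves to an array of element handles, not to their text
    const titleNode = await page.$$(".example");

    // Option 1: for...of awaits each handle in turn
    const result = [];
    for (const t of titleNode) {
        result.push(await t.evaluate(x => x.textContent));
    }

    // Option 2: map each handle to a promise and wait for all of them
    const result2 = await Promise.all(titleNode.map(t => t.evaluate(x => x.textContent)));

    console.log({ result, result2 });

    // Close the browser so the Node process can exit
    await browser.close();
};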

main();
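
To see why forEach is the problem, here is a small sketch that runs without a browser. `fakeHandle` is a hypothetical stand-in for a Puppeteer element handle (it is not part of Puppeteer); the only assumption is that its `evaluate` resolves asynchronously, as the real one does.

```javascript
// Hypothetical stand-in for an element handle: evaluate() is async,
// like ElementHandle.evaluate in Puppeteer.
const fakeHandle = (text) => ({
    evaluate: async (fn) => fn({ textContent: text }),
});

const handles = [fakeHandle("text2"), fakeHandle("text4")];

// forEach discards the promise each async callback returns, so the
// caller has nothing to wait on.
const withForEach = () => {
    const out = [];
    handles.forEach(async (h) => {
        out.push(await h.evaluate((x) => x.textContent));
    });
    return out; // returned before any callback has pushed a value
};

// for...of inside an async function awaits each handle in turn.
const withForOf = async () => {
    const out = [];
    for (const h of handles) {
        out.push(await h.evaluate((x) => x.textContent));
    }
    return out;
};

// map collects the promises; Promise.all waits for all and keeps the order.
const withMap = () =>
    Promise.all(handles.map((h) => h.evaluate((x) => x.textContent)));

const demo = async () => {
    // Read the length synchronously, before anything has been awaited.
    const forEachLength = withForEach().length;
    return {
        forEachLength,
        forOf: await withForOf(),
        map: await withMap(),
    };
};

demo().then(console.log);
// prints { forEachLength: 0, forOf: [ 'text2', 'text4' ], map: [ 'text2', 'text4' ] }
```

In the question's code the same thing happens with the real handles: browser.close() runs right after forEach returns, while the callbacks are still pending.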

huangapple
  • Posted on April 19, 2023 at 22:45:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76055867.html