Puppeteer: How to click element so it opens in new tab?

# Question

I have a list of 25 clickable elements. I need to open each one in a new tab, scrape the page that opens in that tab, and close it, then move on to the next element and repeat for every element in the list.

However, I am having trouble opening the links in a new tab by clicking on them. I managed to open them with `page.goto('link')`, but I want the automation to look more human: instead of pasting the link into a new tab, I want the link to be opened by a click.

    let accountsClickElements = await page.$$eval(".result-lockup__name a", el => el.map(x => x.getAttribute("id")));

    for (let i = 0; i < 25; i++) {
        await autoScroll(page);
        await page.waitFor(3000);
        let id = companies[i];
        await page.focus("#" + accountsClickElements[0]);
        accountsClickElements = await page.$$eval(".result-lockup__name a", el => el.map(x => x.getAttribute("id")));
        await page.waitFor(3000);
        await page.focus("#" + accountsClickElements[i]);
        await page.click("#" + accountsClickElements[i]);
        await page.waitFor(2000);
        console.log("#" + companies[i].linid);
        await page.goBack();
    }

This code opens the links in the same tab, but after a while it fails to get through all 25 elements: the element ids change every time the page is opened, so I get an error.

I then changed the code as shown below, so that instead of selecting each element by its id it selects it by its href attribute. The `target="_blank"` attribute is there, but the link still opens in the same tab. Can you point out why?

    await page.$$eval('.result-lockup__name a', el => el.map(x => x.setAttribute("target", "_blank")));
    let accountsClickElements = await page.$$eval(".result-lockup__name a", el => el.map(x => x.getAttribute("href")));

    for (let i = 0; i < 25; i++) {
        await page.waitFor(2000);
        await autoScroll(page);
        await page.waitFor(2000);
        await page.click('a[href="' + accountsClickElements[i] + '"]');
    }


# Answer 1
**Score**: 7

Middle-click using [`page.click`][1]:

```javascript
let options = {button: 'middle'};
await page.click('a[href="' + accountsClickElements[i] + '"]', options);
```

It might take some time for the new tab to open, which you can wait for with `await page.waitForTimeout(1000)`.

Full code:

```javascript
let accountsClickElements = await page.$$eval('.result-lockup__name a', el => el.map(x => x.getAttribute('href')));

// Log the URL of every target (tab) that gets created
browser.on('targetcreated', function (target) {
    console.log(target.url());
});

let options = {
    button: 'middle'
};
for (let i = 0; i < 25; i++) {
    await page.waitForTimeout(2000);
    await page.focus('a[href="' + accountsClickElements[i] + '"]');
    await page.click('a[href="' + accountsClickElements[i] + '"]', options);
    // Give the new tab time to open before listing the tabs
    await page.waitForTimeout(2000);
    // tab1 is the initial blank tab, tab2 the list page, tab3 the newly opened tab
    const [tab1, tab2, tab3] = await browser.pages();
    await tab3.bringToFront();
    // ListenCompanyPageNewTab is the author's own scraping helper (not shown);
    // the fixed tab3 index only holds if that helper closes the tab when done
    await ListenCompanyPageNewTab(tab3);
    await tab2.bringToFront();
}
```

  [1]: https://pptr.dev/#?show=api-pageclickselector-options
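The fixed two-second pauses in this answer can be swapped for a short polling loop that returns as soon as the extra tab shows up. The following is only a sketch under stated assumptions: `waitForPageCount` is a hypothetical helper name, not part of Puppeteer, and it relies solely on `browser.pages()`. It also sidesteps `page.waitForTimeout`, which newer Puppeteer releases no longer provide.

```javascript
// Hypothetical helper (not a Puppeteer API): poll browser.pages()
// until at least `expected` tabs exist, or give up after `timeoutMs`.
async function waitForPageCount(browser, expected, { timeoutMs = 5000, intervalMs = 100 } = {}) {
    const deadline = Date.now() + timeoutMs;
    while (true) {
        const pages = await browser.pages();
        if (pages.length >= expected) {
            return pages;
        }
        if (Date.now() >= deadline) {
            throw new Error(`Expected ${expected} tabs, found ${pages.length} after ${timeoutMs} ms`);
        }
        // Sleep briefly before polling again
        await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
}

// Usage inside the loop (sketch):
//   const before = (await browser.pages()).length;
//   await page.click(selector, { button: 'middle' });
//   const pages = await waitForPageCount(browser, before + 1);
//   const newTab = pages[pages.length - 1]; // assumption: the newest tab is listed last
```

Counting tabs before the click, rather than hard-coding three, keeps the loop working even if an earlier tab was left open.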



# Answer 2
**Score**: 6

On **Windows** or **Linux**, hold down the Ctrl key before you click:

```js
// Ctrl+click to open in new tab
await page.keyboard.down('Control');
await page.click('a[href="' + accountsClickElements[i] + '"]');
await page.keyboard.up('Control');
```

Then find the tab that was opened. One way to do that is to call `await browser.pages()` and filter out the current page. Note that when you start Puppeteer it already has one tab open, so if the first thing you did was `const page = await browser.newPage();`, you probably have two tabs and need to `.close()` the first one.

Here's a more complete code sample:

```js
import puppeteer from 'puppeteer';

(async () => {
    const browser = await puppeteer.launch();
    // When the browser launches, it should have one about:blank tab open.
    const page = (await browser.pages())[0];

    // A safer way to do the above is to open a new tab and then close all
    // tabs that aren't the tab you just opened.
    // const page = await browser.newPage();
    // // Close any tabs that aren't the one we created on the line above
    // for (const p of (await browser.pages())) {
    //     if (p !== page) {
    //         await p.close()
    //     }
    // }

    await page.goto('https://example.com/');

    // Ctrl-click the "More information..." link to open it in a new tab
    await page.keyboard.down('Control');
    await page.click('a');
    await page.keyboard.up('Control');

    // Wait a second for the tab to open
    await page.waitForTimeout(1000);

    // Print the URLs of all currently open tabs
    console.error((await browser.pages()).map(p => p.url()));

    if ((await browser.pages()).length !== 2) {
        throw new Error("unexpected number of tabs");
    }
    const otherPage = (await browser.pages())[1];
    // Do something with the other tab
    // ...
    // Then close it
    await otherPage.close();

    await browser.close();
})();
```

On macOS this won't work. The shortcut there is Command+click, so you would have to use `await page.keyboard.down('Meta')` instead of `'Control'`, but that doesn't work either, because on macOS keyboard shortcuts that use the Command key are usually handled by the operating system, not the browser (things like ⌘-A, which normally selects all text, don't work either). You can try:

  • middle-clicking, as @nightowl suggested

  • Shift+clicking to open the link in a new window, then using `browser.browserContexts()` to find the new window

  • running your code under Linux, in a virtual machine or a Docker container
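The "filter out the current page" step this answer mentions can be written as a small diff between two `browser.pages()` snapshots. This is a minimal sketch, not the answer author's code: `findNewPages` is a hypothetical name, and it assumes Puppeteer hands back the same `Page` object for the same tab on every call, which the `p !== page` comparison in the answer's sample also relies on.

```javascript
// Hypothetical helper: return the pages present in `after` but not in `before`.
function findNewPages(before, after) {
    const known = new Set(before);
    return after.filter(p => !known.has(p));
}

// Usage (sketch):
//   const before = await browser.pages();
//   await page.keyboard.down('Control');
//   await page.click(selector);
//   await page.keyboard.up('Control');
//   // ...wait for the tab to open...
//   const [otherPage] = findNewPages(before, await browser.pages());
```

Compared with indexing into `browser.pages()`, this does not depend on how many tabs were already open or on their order.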

# Answer 3
**Score**: 3

Add `target="_blank"` to the elements you would like to click:

    await page.$$eval('.result-lockup__name a', el => el.map(x => x.setAttribute("target", "_blank")));
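When a `target="_blank"` link does open a new tab, Puppeteer's `popup` page event can hand back that tab without guessing its index. The sketch below assumes a plain left click triggers the new tab. `clickAndWaitForPopup` is a hypothetical helper name, and since the question reports that the link still opened in the same tab on the asker's site, this may not apply there.

```javascript
// Hypothetical helper: click `selector` and resolve with the Page that the
// click opens, using the 'popup' event emitted by the opening page.
async function clickAndWaitForPopup(page, selector) {
    // Register the listener before clicking so the event cannot be missed
    const popupPromise = new Promise(resolve => page.once('popup', resolve));
    await page.click(selector);
    return popupPromise;
}

// Usage (sketch):
//   const newTab = await clickAndWaitForPopup(page, 'a[href="' + href + '"]');
//   // ...scrape newTab...
//   await newTab.close();
```

If the click never opens a new tab, the returned promise stays pending, so in practice it is worth racing it against a timeout.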

# Answer 4
**Score**: 0

You can first create a page on the browser instance and then navigate it to the URL:

    const page = await browser.newPage();

    await page.goto('https://url.com');
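Applied to the list from the question, this approach becomes a loop that opens one tab per link, scrapes it, and closes it again. The following is a rough sketch only: `scrapeEachInNewTab` and the `scrape` callback are hypothetical names, and `urls` is assumed to hold absolute URLs (`getAttribute('href')` can return relative ones, which `page.goto` will not accept). Note that it navigates directly instead of clicking, which the asker wanted to avoid.

```javascript
// Hypothetical helper: visit each URL in its own tab, run `scrape` on that
// tab, then close it before moving on to the next URL.
async function scrapeEachInNewTab(browser, urls, scrape) {
    const results = [];
    for (const url of urls) {
        const tab = await browser.newPage();
        try {
            await tab.goto(url);
            results.push(await scrape(tab));
        } finally {
            // Always close the tab, even if navigation or scraping fails
            await tab.close();
        }
    }
    return results;
}

// Usage (sketch):
//   const titles = await scrapeEachInNewTab(browser, hrefs, tab => tab.title());
```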

# Answer 5
**Score**: 0

    import puppeteer from 'puppeteer'

    (async () => {
        try {
            // Wait for a new tab to be created and resolve with its Page instance
            async function getNewBrowserTab(browser) {
                let resolveResult

                async function onTargetcreatedHandler(target) {
                    if (target.type() === 'page') {
                        const newPage = await target.page()
                        const newPagePromise = new Promise(y =>
                            newPage.once('domcontentloaded', () => y(newPage))
                        )

                        const readyState = await newPage.evaluate(
                            () => document.readyState
                        )

                        browser.off('targetcreated', onTargetcreatedHandler) // unsubscribe

                        // Resolve right away if the DOM is ready, otherwise once it has loaded
                        return readyState.match('complete|interactive')
                            ? resolveResult(newPage)
                            : resolveResult(newPagePromise)
                    }
                }

                return new Promise(resolve => {
                    resolveResult = resolve
                    browser.on('targetcreated', onTargetcreatedHandler)
                })
            }

            // Usage
            const browser = await puppeteer.launch({ headless: false })
            const page = await browser.newPage()

            await page.goto('https://www.google.com/')

            // Start listening before the click so the new tab cannot be missed
            const newPagePromise = getNewBrowserTab(browser)

            // Click on a link with the middle button to open it in a new browser tab
            await page.click('a[href]', { button: 'middle' })

            // Wait for the new tab and get its page instance
            const newPage = await newPagePromise

            // Switch to the new tab
            await newPage.bringToFront()

            // Wait a bit to see the page
            await new Promise(resolve => setTimeout(resolve, 1000))

            await newPage.close()
            await page.close()
            await browser.close()
        } catch (e) {
            console.error(e)
        }
    })()

Answer based on the puppeteer GitHub.
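A more compact variant of the same idea can lean on `browser.waitForTarget` instead of a hand-written `targetcreated` handler. This is a sketch under assumptions, not the code this answer is based on: `clickAndWaitForNewTab` is a hypothetical name, and it assumes `browser.targets()` returns the same `Target` objects on every call so that a `Set` can tell old targets from new ones. It does not wait for the new page's DOM to load.

```javascript
// Hypothetical helper: click `selector` and resolve with the Page of the tab
// that the click opens.
async function clickAndWaitForNewTab(browser, page, selector, clickOptions = { button: 'middle' }) {
    // Remember which targets exist before the click
    const known = new Set(browser.targets());
    await page.click(selector, clickOptions);
    // Resolves with a page target that was not there before the click
    const target = await browser.waitForTarget(
        t => t.type() === 'page' && !known.has(t),
        { timeout: 5000 }
    );
    return target.page();
}

// Usage (sketch):
//   const newPage = await clickAndWaitForNewTab(browser, page, 'a[href]');
//   await newPage.bringToFront();
```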

huangapple
  • Published on January 3, 2020, 17:09:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59575748.html