June 13, 2023, 18:58:42

SyntaxError using WebDriverWait to click button with Selenium Python

Question

Current problem: The code below runs fine until I insert the following code to click an arrow on the product square/profile.

Bigger problem: The code as a whole runs fine, but the dataset is distorted. After some experimenting, I discovered that the distorted data is all located "below the fold." I'm trying to click on the arrow on each product square/profile in order to expose the otherwise hidden data. I believe that if I can do this, the scraper should work and the dataset will no longer be distorted.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd

data = []

for y in range(1, 3):
    website = f'https://www.knowde.com/b/markets-personal-care/products{y}'
    path = '/Users/kdavid3mbp/Python/chrome_driver64/chromedriver'
    driver = webdriver.Chrome(path)
    driver.get(website)

    for x in range(1, 37):
        products = driver.find_elements('xpath', f'//*[@id="__next"]/main/div/div[3]/div[3]/div[1]/div[2]/div[{x}]')

        for product in products:
            WebDriverWait(driver, 10).until(EC.element_to_be_clickable(('xpath', '/div/div/svg')).click()

            brand = product.find_element('xpath', './a/div[2]/div/p[1]').text
            item = product.find_element('xpath', './a/div[2]/div/p[2]').text
            inci_name = product.find_element('xpath', './a/div[2]/div/div[1]/span[2]').text
            try:
                ingredient_origin = product.find_element('xpath', './a/div[2]/div/div[3]/span[2]').text
            except NoSuchElementException:
                ingredient_origin = 'null'
            try:
                function = product.find_element('xpath', './a/div[2]/div/div[2]/span[2]').text
            except NoSuchElementException:
                function = 'null'
            try:
                benefit_claims = product.find_element('xpath', './a/div[2]/div/div[4]/span[2]').text
            except NoSuchElementException:
                benefit_claims = 'null'
            try:
                description = product.find_element('xpath', './a/div[2]/div/p[3]').text
            except NoSuchElementException:
                description = 'null'
            try:
                labeling_claims = product.find_element('xpath', './a/div[2]/div/div[5]/span[2]').text
            except NoSuchElementException:
                labeling_claims = 'null'
            try:
                compliance = product.find_element('xpath', './a/div[2]/div/div[6]/span[2]').text
            except NoSuchElementException:
                compliance = 'null'
            try:
                hlb_value = product.find_element('xpath', './a/div[2]/div/div[4]/span[2]').text
            except NoSuchElementException:
                hlb_value = 'null'
            try:
                end_uses = product.find_element('xpath', '/a/div[2]/div/div[4]/span[2]').text
            except NoSuchElementException:
                end_uses = 'null'
            try:
                cas_no = product.find_element('xpath', './a/div[2]/div/div[5]/span[2]').text
            except NoSuchElementException:
                cas_no = 'null'
            try:
                chemical_name = product.find_element('xpath', './a/div[2]/div/div[2]/span[2]').text
            except NoSuchElementException:
                chemical_name = 'null'
            try:
                synonyms = product.find_element('xpath', './a/div[2]/div/div[6]/span[2]').text
            except NoSuchElementException:
                synonyms = 'null'
            try:
                chemical_family = product.find_element('xpath', './a/div[2]/div/div[5]/span[2]').text
            except NoSuchElementException:
                chemical_family = 'null'
            try:
                features = product.find_element('xpath', './a/div[2]/div/div[7]/span[2]').text
            except NoSuchElementException:
                features = 'null'
            try:
                grade = product.find_element('xpath', './a/div[2]/div/div[5]/span[2]').text
            except NoSuchElementException:
                grade = 'null'

            dict = {
                'brand': brand,
                'item': item,
                'inci_name': inci_name,
                'ingredient_origin': ingredient_origin,
                'function': function,
                'benefit_claims': benefit_claims,
                'description': description,
                'labeling_claims': labeling_claims,
                'compliance': compliance,
                'hlb_value': hlb_value,
                'end_uses': end_uses,
                'cas_no': cas_no,
                'chemical_name': chemical_name,
                'synonyms': synonyms,
                'chemical_family': chemical_family,
                'features': features,
                'grade': grade
            }

            data.append(dict)
            print('Saving: ', dict['brand'])

# Closes driver once for loop is completed
driver.quit()

df = pd.DataFrame(data)
df.to_csv('/Users/kdavid3mbp/Python/cosmetics_data.csv', index=False)

The current problem is when inserting the following:

WebDriverWait(driver, 10).until(EC.element_to_be_clickable(('xpath', '/div/div/svg')).click()

I get a SyntaxError:

  File "/var/folders/90/82_f843n4h9drvxh7z3tqg840000gn/T/ipykernel_34523/1099992952.py", line 22
    brand = product.find_element('xpath', './a/div[2]/div/p[1]').text
    ^
SyntaxError: invalid syntax

I'm not sure how to arrange this so that I click the down arrows. I'd like to click each one for the 36 product squares/profiles on each page.

Answer 1

Score: 1

According to the definition quoted below, element_to_be_clickable() must be called with a single tuple (the locator), because it is not a function but a class whose initializer expects just one argument beyond the implicit self:

class element_to_be_clickable(object):
    """ An Expectation for checking an element is visible and enabled such that you can click it."""
    def __init__(self, locator):
        self.locator = locator

    def __call__(self, driver):
        element = visibility_of_element_located(self.locator)(driver)
        if element and element.is_enabled():
            return element
        else:
            return False
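
To make the class-based design concrete, here is a minimal sketch of the same pattern using stand-in names. FakeDriver, element_is_present, and wait_until are hypothetical illustrations, not Selenium APIs; they only mimic how a condition object is built from one locator tuple and later called with the driver by the waiting loop.

```python
import time


class FakeDriver:
    """Hypothetical stand-in for a WebDriver: maps locator tuples to elements."""

    def __init__(self, elements):
        self.elements = elements

    def find(self, locator):
        return self.elements.get(locator)


class element_is_present:
    """Hypothetical condition: built from ONE locator tuple, then called with a driver."""

    def __init__(self, locator):
        self.locator = locator

    def __call__(self, driver):
        element = driver.find(self.locator)
        return element if element is not None else False


def wait_until(driver, condition, timeout=1.0, poll=0.05):
    """Poll condition(driver) until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


driver = FakeDriver({("xpath", "/div/div/svg"): "<svg arrow>"})

# Correct: the locator is passed as a single tuple argument.
found = wait_until(driver, element_is_present(("xpath", "/div/div/svg")))

# Incorrect: two positional arguments instead of one tuple raise a TypeError.
try:
    element_is_present("xpath", "/div/div/svg")
    two_args_error = None
except TypeError as exc:
    two_args_error = exc
```

The inner pair of parentheses builds the tuple and the outer pair calls the class, which is why the locator in the question is wrapped in double parentheses.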

So instead of:

WebDriverWait(driver, 10).until(EC.element_to_be_clickable(('xpath', '/div/div/svg')).click()

You need to add one more closing parenthesis, so that the call to until() is closed before .click() is chained onto its result:

WebDriverWait(driver, 10).until(EC.element_to_be_clickable(('xpath', '/div/div/svg'))).click()
# note the additional closing parenthesis before .click()
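
One way to confirm that the only difference is the unbalanced parenthesis is to compile both lines without running them. The sketch below uses only the built-in compile(); the helper names parses and paren_balance are made up for this illustration, and the parenthesis count is deliberately naive (it assumes no parentheses appear inside string literals, which holds for these two lines). Because the call to until( is left open in the first line, the parser keeps reading into the following line, which is consistent with the traceback in the question pointing at the brand = ... statement.

```python
broken = "WebDriverWait(driver, 10).until(EC.element_to_be_clickable(('xpath', '/div/div/svg')).click()\n"
fixed = "WebDriverWait(driver, 10).until(EC.element_to_be_clickable(('xpath', '/div/div/svg'))).click()\n"


def parses(source):
    """Return True if source is syntactically valid Python; nothing is executed."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


def paren_balance(source):
    """Count '(' minus ')'. Naive: assumes no parentheses inside string literals."""
    return source.count("(") - source.count(")")


print(parses(broken), paren_balance(broken))  # False 1
print(parses(fixed), paren_balance(fixed))    # True 0
```

A balance of 1 means one opening parenthesis is never closed; after the fix the balance is 0 and the line compiles.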
