How can I print out all text in a web table column in Python using Selenium?

Question

I am attempting to use a for-loop in Python to print out the text in a web table column, using the XPath of each cell in the column. The XPath is similar to this:

//*[@id="webTable"]/tbody/tr[2]/td[6]

The for-loop I am using is written like this:

for x in range(totalRows):
    y = driver.find_elements(by=By.XPATH, value='//*[@id="webTable"]/tbody/tr[' + str(x) + ']/td[6]')
    print(y)

However, when I run the program, this is the output that I get:

[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="c488195e-8751-43c8-9d01-6e873cb2cc4a")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="70f9ad39-4bdd-4bcf-b869-c31968de4492")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="f8fd427e-2bd3-4995-8b24-7cb7bda14f1a")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="0541eb71-24a1-44e9-bb9d-bacc63426bad")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="b19a839e-a6c1-43f2-bcf1-1f0692ff2c0f")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="b427383a-31a5-49f8-a466-62fb5a489047")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="1cd4bd3f-6e7f-4a89-950e-0f5dab47eabd")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="5c964e47-2fff-4c4d-9743-eecbd1c7bea6")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="54ff1ef7-0693-43e2-939e-c387f8f20e06")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="21a63bd7-7dc5-4860-bfb2-1309a842c2f7")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="aee78709-f4ee-4e0f-8cb7-6c3114b52fba")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="28ef515e-4c66-472b-8126-76793eeebee2")>]
[<selenium.webdriver.remote.webelement.WebElement (session="e8afe17e1e80e6c09dd2656800326654", element="2fb995ff-9100-4124-9efe-f8c2bfe49767")>]

I tried writing the for-loop like this:

for x in range(totalRows):
    y = driver.find_elements(by=By.XPATH, value='//*[@id="webTable"]/tbody/tr[' + str(x) + ']/td[6]')
    print(y.text)

and:

for x in range(totalRows):
    y = driver.find_elements(by=By.XPATH, value='//*[@id="webTable"]/tbody/tr[' + str(x) + ']/td[6]').text
    print(y)

but when I write it like that, I receive this error:

AttributeError: 'list' object has no attribute 'text'

I don't know how else to extract the text within the cells, so any help would be appreciated!

Answer 1

Score: 1

Here is the solution:

table = driver.find_element(by=By.XPATH, value='//*[@id="webTable"]/tbody')
rows = table.find_elements(by=By.TAG_NAME, value="tr")

# Zero-based index of the column to extract: 1 is the 2nd column in the table
# (the td[6] cell from the question would be index 5)
desired_column = 1
desired_column_data = []

for row in rows:
    columns = row.find_elements(by=By.TAG_NAME, value='td')
    for index, col in enumerate(columns):
        if index == desired_column:
            desired_column_data.append(col.text)

print(desired_column_data)

Hope it helps :)
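
As a further sketch (not from the original answer): since only one column is needed, a single XPath without a row index can match that cell in every body row at once. The helper names `column_xpath` and `column_texts` are hypothetical, and the literal "xpath" is the string value of Selenium's `By.XPATH`, used here so the snippet can be read and tried without a live browser session.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def column_xpath(table_id, column):
    """Return an XPath matching the 1-based `column` cell of every body row."""
    return f'//*[@id="{table_id}"]/tbody/tr/td[{column}]'


def column_texts(driver, table_id, column):
    """Collect the text of each matched cell (hypothetical helper).

    `driver` is assumed to be a Selenium WebDriver with the page already loaded.
    """
    cells = driver.find_elements(by=XPATH, value=column_xpath(table_id, column))
    return [cell.text for cell in cells]


print(column_xpath("webTable", 6))  # //*[@id="webTable"]/tbody/tr/td[6]
```

With a real session, `print(column_texts(driver, "webTable", 6))` would print the sixth column as a list of strings. Rows that have no sixth `td` are simply not matched.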

Answer 2

Score: 1

driver.find_elements returns a list of WebElement objects, so the AttributeError is expected when you try to read .text from the list itself. Instead, use driver.find_element while iterating through totalRows. XPath positions start at 1, so tr[0] never matches a row; the loop below therefore runs from 1 to totalRows, assuming totalRows is the number of rows in the tbody.

for x in range(1, totalRows + 1):
    y = driver.find_element(by=By.XPATH, value='//*[@id="webTable"]/tbody/tr[' + str(x) + ']/td[6]').text
    print(y)
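
To illustrate the difference without a browser, the sketch below uses a stand-in class, `FakeCell` (hypothetical, not part of Selenium), in place of `WebElement`. It shows that a list like the one `find_elements` returns has no `.text`, while each item in it does, so the original `find_elements` call also works once the list is iterated.

```python
from dataclasses import dataclass


@dataclass
class FakeCell:
    """Stand-in for a Selenium WebElement; only the `text` attribute is modelled."""
    text: str


def texts_of(elements):
    """Return the text of every element in a find_elements-style list."""
    return [element.text for element in elements]


# What find_elements would hand back for one row: a list, possibly empty.
matches = [FakeCell("42")]

print(hasattr(matches, "text"))  # False: the list itself has no .text
print(texts_of(matches))         # ['42']
print(texts_of([]))              # []: no match gives an empty list, not an error
```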

huangapple
  • Posted on February 24, 2023 at 10:09:01
  • When republishing, please keep the link to this article: https://go.coder-hub.com/75552030.html