September 9, 2020, 21:34:09

Getting invalid type of elements in Selenium

Question

I am working on Ubuntu 20.04 with the latest IntelliJ IDEA Community Edition, Firefox, and GeckoDriver.

I am trying to get some timetables from a webpage and copy them to a .txt file (or a list; it doesn't matter).

I am trying this:

    WebDriver driver = new FirefoxDriver();
    driver.manage().timeouts().implicitlyWait(2000, TimeUnit.MILLISECONDS); // maximum wait time
    driver.get("http://telematics.oasa.gr/#main");
    driver.findElement(By.xpath("//option[contains(.,'021')]")).click(); // selecting trip

    List<WebElement> oas = driver.findElements(By.xpath("//div/ul/li"));
    System.out.println(oas.size());
    System.out.println(oas);

Page link: http://telematics.oasa.gr/#lineDetails_1151_021%20:%20%CE%A0%CE%9B%CE%91%CE%A4%CE%95%CE%99%CE%91%20%CE%9A%CE%91%CE%9D%CE%99%CE%93%CE%93%CE%9F%CE%A3%20-%20%CE%93%CE%9A%CE%A5%CE%96H%20(%CE%9A%CE%A5%CE%9A%CE%9B%CE%99%CE%9A%CE%97)_9-86

Here is the HTML of the page:

    <li class="list-group-item scheduleEntryL"><button type="button" class="btn btn-info btn-circle" style="cursor:default;">07</button>&nbsp;&nbsp;&nbsp;07:10 &nbsp;&nbsp;&nbsp; 07:25 &nbsp;&nbsp;&nbsp; 07:40 &nbsp;&nbsp;&nbsp; 07:55 &nbsp;&nbsp;&nbsp; </li>

After this, the output is:

    19

    [[[FirefoxDriver: firefox on LINUX (115ffdb6-1eb7-44c4-bebd-dee885674bab)] -> xpath: //div/ul/li], ...]

This means my list has 19 elements, but that is not what I want.

Summary:

1. The type of the elements I get is not right. The list should contain:

    [...,07:00,07:10,07:25,....]

2. It should contain 59 elements, because the page lists 59 departures, but some of them are on the same line.

The page has 19 lines, so it is probably returning every line as one element, which is also not what I want.

Please help.

I have checked similar posts on this site and they did not help.

Answer 1

Score: 1

Your XPath returns the rows of the page, not the individual elements that contain the required values. Also, the text values sit between two nodes, so we need JavaScript to read them.

    // Needs: java.util.LinkedList, java.util.List, org.openqa.selenium.By,
    // org.openqa.selenium.JavascriptExecutor, org.openqa.selenium.WebElement

    // Extract the departure times without *
    List<WebElement> oas = driver.findElements(By.xpath("//li[@class='list-group-item scheduleEntryL']"));
    LinkedList<String> timeValues = new LinkedList<String>();
    String myCountryProxy = null;
    for (WebElement element : oas) {
        // The text sits between two nodes and Selenium's getText() did not work for it, so we use JavaScript.
        // The script treats each element of oas as the parent and reads the child node that holds the text.
        // Here that is the 2nd child node, so we pass index 1: childNodes[1]
        myCountryProxy = ((JavascriptExecutor) driver).executeScript("return arguments[0].childNodes[1].textContent;", element).toString();
        // Remove the extra non-breaking spaces (U+00A0) from the string
        if (!myCountryProxy.equalsIgnoreCase("\u00A0\u00A0\u00A0")) {
            myCountryProxy = myCountryProxy.replaceAll(" \u00A0\u00A0\u00A0 ", " ").replaceAll("\u00A0\u00A0\u00A0", "").replaceAll(" \u00A0\u00A0\u00A0 ", "");
            myCountryProxy = myCountryProxy.trim();
            // Split the string into individual values
            String[] split = myCountryProxy.split("\\s+");
            for (String str : split) {
                timeValues.add(str);
            }
        }
    }

    // Extract the departure times with *, e.g. 05:00*
    oas = driver.findElements(By.xpath("//span[@class='xtra']"));
    for (WebElement element : oas) {
        // Here the text is in the 1st child node, so we pass index 0: childNodes[0]
        myCountryProxy = ((JavascriptExecutor) driver).executeScript("return arguments[0].childNodes[0].textContent;", element).toString();
        timeValues.add(myCountryProxy);
    }

    // Extract the departure times on the same line as a time with *, e.g. 05:10     05:30     05:50
    oas = driver.findElements(By.xpath("//li[@class='list-group-item scheduleEntryL']/span"));
    if (oas.size() > 0) {
        oas = driver.findElements(By.xpath("//li[@class='list-group-item scheduleEntryL']/span/.."));
        for (WebElement element : oas) {
            // Here the text is in the 4th child node, so we pass index 3: childNodes[3]
            myCountryProxy = ((JavascriptExecutor) driver).executeScript("return arguments[0].childNodes[3].textContent;", element).toString();
            // Remove the extra non-breaking spaces (U+00A0) from the string
            if (!myCountryProxy.equalsIgnoreCase(" \u00A0\u00A0\u00A0 ")) {
                myCountryProxy = myCountryProxy.replaceAll(" \u00A0\u00A0\u00A0 ", " ").replaceAll("\u00A0\u00A0\u00A0", "").replaceAll(" \u00A0\u00A0\u00A0 ", "");
                myCountryProxy = myCountryProxy.trim();
                // Split the string into individual values
                String[] split = myCountryProxy.split("\\s+");
                for (String str : split) {
                    timeValues.add(str);
                }
            }
        }
    }
    System.out.println(timeValues);

I ran this code on my machine and got the following output:

[06:10, 06:25, 06:40, 06:55, 07:05, 07:15, 07:25, 07:35, 07:45, 07:55, 08:05, 08:20, 08:30, 08:40, 08:55, 09:05, 09:15, 09:30, 09:50, 10:10, 10:25, 10:45, 11:00, 11:20, 11:35, 11:55, 12:10, 12:30, 12:45, 13:20, 13:40, 13:55, 14:10, 14:25, 14:35, 14:45, 15:00, 15:10, 15:20, 15:35, 15:45, 15:55, 16:10, 16:20, 16:30, 16:45, 16:55, 17:05, 17:20, 17:35, 17:55, 18:15, 18:30, 18:45, 19:00, 19:20, 19:35, 19:55, 20:10, 20:25, 20:40, 21:00, 21:20, 21:40, 22:05, 22:30, 22:55, 05:00*, 23:20*, 05:10, 05:30, 05:50]
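
As a possible alternative to stripping the non-breaking spaces by hand, the individual times could be pulled out of a row's text with a regular expression. This is only a sketch, not part of the original answer: the class name `DepartureTimeParser` and the method `extractTimes` are hypothetical, and it assumes each departure is written as `HH:mm` with an optional trailing `*`, as in the output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original answer.
final class DepartureTimeParser {
    // Assumption: a departure looks like HH:mm, optionally followed by '*'.
    private static final Pattern TIME = Pattern.compile("\\d{2}:\\d{2}\\*?");

    // Returns every time found in the text, in order of appearance.
    static List<String> extractTimes(String rowText) {
        List<String> times = new ArrayList<>();
        Matcher matcher = TIME.matcher(rowText);
        while (matcher.find()) {
            times.add(matcher.group());
        }
        return times;
    }

    public static void main(String[] args) {
        // Text shaped like the <li> from the question: hour label, then times
        // separated by non-breaking spaces (U+00A0).
        String row = "07\u00A0\u00A0\u00A007:10 \u00A0\u00A0\u00A0 07:25 "
                + "\u00A0\u00A0\u00A0 07:40 \u00A0\u00A0\u00A0 07:55 \u00A0\u00A0\u00A0 ";
        System.out.println(extractTimes(row)); // prints [07:10, 07:25, 07:40, 07:55]
    }
}
```

Because the pattern only matches `HH:mm`, the hour label from the button ("07") and all whitespace are ignored, so the same method can be applied to whatever text string was read from each row.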

Please note that I added the values from the rows that contain a time with * at the end of the list, so you will find those values there.
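
If chronological order matters, the collected list could be sorted afterwards. The following is a minimal sketch rather than part of the original answer: `DepartureTimeSorter` and `sortChronologically` are hypothetical names, and it assumes every entry is a zero-padded `HH:mm` string, optionally ending in `*`, with all departures falling on the same day.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not from the original answer.
final class DepartureTimeSorter {
    // Returns a sorted copy; the input list is left unchanged.
    static List<String> sortChronologically(List<String> times) {
        List<String> sorted = new ArrayList<>(times);
        // Zero-padded HH:mm strings sort chronologically when compared as text.
        // The '*' is ignored for comparison only; it stays in the result.
        sorted.sort(Comparator.comparing((String t) -> t.replace("*", "")));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> times = Arrays.asList("06:10", "22:55", "05:00*", "23:20*", "05:10");
        System.out.println(sortChronologically(times));
        // prints [05:00*, 05:10, 06:10, 22:55, 23:20*]
    }
}
```

A timetable that runs past midnight would need extra handling, since a time such as 00:10 would sort before the morning departures.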

To extract the text between nodes, I referred to this SO post.

Also, if you want to select a drop-down value, use Selenium's Select class:

    Select select = new Select(driver.findElement(By.id("lineSelect")));
    // Select the drop-down value: 021 : ΠΛΑΤΕΙΑ ΚΑΝΙΓΓΟΣ - ΓΚΥΖH (ΚΥΚΛΙΚΗ)
    select.selectByValue("1151_54_9");
