How to extract only the number from a string

Question

I am trying to extract the prices, in USD, as text from this site.

I used the locator //span[@data-originalprice] with Selenium's getText(), but the result still contains more than just the number. I also tried splitting on \\$, which returned nothing, and a regex, text.split("^-?\\d*(\\.\\d+)?$"), which did not work either.
Any ideas?

Answer 1

Score: 0

To extract and print the prices with the non-ASCII characters stripped, you can use replaceAll("[^\\p{ASCII}]", "") together with Java 8's stream() and map(). Either of the following locator strategies works:

  • cssSelector:

    driver.get("https://www.wooloverslondon.com/new-styles?page=1&gender=161&style=77");
    System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.cssSelector("div.associated-product__price p>span"))).stream().map(element->element.getText().replaceAll("[^\\p{ASCII}]", "")).collect(Collectors.toList()));
    
  • xpath:

    driver.get("https://www.wooloverslondon.com/new-styles?page=1&gender=161&style=77");
    System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.xpath("//div[@class='associated-product__price']//p/span"))).stream().map(element->element.getText().replaceAll("[^\\p{ASCII}]", "")).collect(Collectors.toList()));
    
  • Console Output:

    [7,035.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 5,607.00, 5,607.00, 5,607.00, 4,996.00, 7,646.00]
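
Note that replaceAll("[^\\p{ASCII}]", "") only drops non-ASCII characters, so an ASCII symbol such as $ and the thousands separators would remain in the text. If an actual numeric value is needed, one possible approach is to pull out the first number with a regex and parse it. The following is a minimal sketch, not part of the original answer: the class name PriceParser, the method parsePrice, and the sample strings are hypothetical, and it assumes prices use a comma as the thousands separator and a dot as the decimal separator.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PriceParser {
    // A digit, then any mix of digits and thousands commas, then an optional decimal part.
    // Assumption: US-style formatting such as 7,035.00.
    private static final Pattern NUMBER = Pattern.compile("\\d[\\d,]*(?:\\.\\d+)?");

    // Returns the first number found in the text, or empty if there is none.
    public static Optional<BigDecimal> parsePrice(String text) {
        Matcher m = NUMBER.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        // Remove the thousands separators before parsing.
        return Optional.of(new BigDecimal(m.group().replace(",", "")));
    }

    public static void main(String[] args) {
        // Made-up sample strings standing in for element.getText() results;
        // \u20AC is the euro sign, written as an escape to avoid encoding issues.
        List<String> raw = Arrays.asList("$7,035.00", "\u20AC6,015.00", "USD 4,996.00", "n/a");

        List<BigDecimal> prices = raw.stream()
                .map(PriceParser::parsePrice)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());

        System.out.println(prices); // prints [7035.00, 6015.00, 4996.00]
    }
}
```

In the Selenium stream above, the same idea would mean mapping each element's getText() result through parsePrice instead of replaceAll. The split call from the question does not behave this way because String.split treats its argument as a delimiter, and a pattern anchored with ^ and $ can only match when the whole string is already a bare number.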
    

huangapple
  • Posted on September 23, 2020, 01:52:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64015202.html