How to extract only the number from a string
Question
I am trying to extract the prices in USD, as text, from this site.
I used the locator //span[@data-originalprice] with Selenium's getText(), but the result still contains more than just the number.
I also tried splitting on \\$, and nothing came back.
I tried a regex as well, text.split("^-?\\d*(\\.\\d+)?$"), and still got nothing.
Any ideas?
Answer 1
Score: 0
To extract and print the prices with the non-ASCII characters stripped, you can use replaceAll("[^\\p{ASCII}]", "") together with Java 8's stream() and map(), using either of the following locator strategies:
- cssSelector:

      driver.get("https://www.wooloverslondon.com/new-styles?page=1&gender=161&style=77");
      System.out.println(new WebDriverWait(driver, 20)
          .until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.cssSelector("div.associated-product__price p>span")))
          .stream()
          .map(element -> element.getText().replaceAll("[^\\p{ASCII}]", ""))
          .collect(Collectors.toList()));

- xpath:

      driver.get("https://www.wooloverslondon.com/new-styles?page=1&gender=161&style=77");
      System.out.println(new WebDriverWait(driver, 20)
          .until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.xpath("//div[@class='associated-product__price']//p/span")))
          .stream()
          .map(element -> element.getText().replaceAll("[^\\p{ASCII}]", ""))
          .collect(Collectors.toList()));

- Console Output:
[7,035.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 6,015.00, 5,607.00, 5,607.00, 5,607.00, 4,996.00, 7,646.00]
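The values above are still strings with grouping commas. If an actual number is needed, one option is to match the numeric part with a regex and parse it. The sketch below is only an illustration and is not part of the original answer: the class name PriceExtractor and the method extractNumber are hypothetical, and it assumes prices use a comma as the thousands separator and a dot as the decimal separator. It also hints at why the split attempt in the question does not help: split treats the regex as a delimiter, and a pattern anchored with ^ and $ can only match when the entire string is a number.

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Selenium or the original answer.
class PriceExtractor {
    // Optional minus sign, digits with optional comma grouping, optional decimal part.
    // No ^ or $ anchors, so the number may appear anywhere in the text.
    private static final Pattern NUMBER = Pattern.compile("-?\\d[\\d,]*(?:\\.\\d+)?");

    // Returns the first number found in the text, or an empty Optional if there is none.
    static Optional<BigDecimal> extractNumber(String text) {
        Matcher matcher = NUMBER.matcher(text);
        if (!matcher.find()) {
            return Optional.empty();
        }
        // Remove grouping commas so BigDecimal can parse the value.
        return Optional.of(new BigDecimal(matcher.group().replace(",", "")));
    }

    public static void main(String[] args) {
        System.out.println(extractNumber("$7,035.00")); // prints Optional[7035.00]
        System.out.println(extractNumber("no price"));  // prints Optional.empty
    }
}
```

In the stream from the answer, this could replace the replaceAll step, for example .map(element -> PriceExtractor.extractNumber(element.getText())). BigDecimal is used here rather than double so that currency amounts keep their exact decimal digits.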