September 8, 2017 13:15:00 · go

How can I write a regex that excludes certain suffixes?

Question


https://regex101.com/ <- for those who want to test the regex.

I'm working on an Indonesian price parser.
Say I have the examples below:

1) 150 k
2) 150 kilobyte
3) 150 ka
4) 150 k2
5) 150 k)
6) 150 k.

We know that 1), 5), and 6) can be prices, while the rest obviously cannot.
My real regex is a bit more complicated, but for simplicity,
let's say my regex is: [0-9]+(\s*[k])
This matches all of 1) to 6).
So I added [^0-9a-zA-Z] to the regex: [0-9]+(\s*[k])[^0-9a-zA-Z]
Now I get only 1), 5), and 6), which is fine. However, the problem is that the matches now include an unnecessary suffix such as ")" or ",".
How can I parse just '150 k', without any suffix like ")" or "," that is unrelated to the price?
After matching 5) and 6), do I need one more pass that manually strips those suffixes?

Thank you in advance for any ideas.

Answer 1

Score: 2


You can use a word boundary - \b. You can also use one at the start, instead of the space:

\b[0-9]+\s*k\b

Working example: https://regex101.com/r/RAF2Vg/3
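
Since the question is tagged go, here is a minimal sketch of the same pattern with Go's standard regexp package, run against the six samples from the question. The helper name findPrices and the variable priceRe are made up for illustration. Note that \b in Go's regexp is an ASCII word boundary.

```go
package main

import (
	"fmt"
	"regexp"
)

// priceRe uses the pattern from this answer: digits, optional whitespace,
// then a "k" that must be followed by a word boundary.
var priceRe = regexp.MustCompile(`\b[0-9]+\s*k\b`)

// findPrices is a hypothetical helper name; it returns every match in text.
func findPrices(text string) []string {
	return priceRe.FindAllString(text, -1)
}

func main() {
	samples := []string{
		"150 k", "150 kilobyte", "150 ka", "150 k2", "150 k)", "150 k.",
	}
	for i, s := range samples {
		// Samples 1, 5 and 6 yield "150 k"; the others yield no match.
		fmt.Printf("%d) %q -> %q\n", i+1, s, findPrices(s))
	}
}
```

Because \b is zero-width, the ")" or "." after the "k" is checked but never becomes part of the match, so no second clean-up pass should be needed for these samples.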

Answer 2

Score: 2


I think (\d+\s*k)\b will serve your purpose. It checks that a word boundary follows the 'k'. The boundary is satisfied by the end of the input or by any non-word character after the 'k', yes, even a ). Look at this example.
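
As a rough sketch of how the capture group in this pattern could be read in Go (the helper name extractPrice and the variable groupRe are hypothetical, not from the answer), FindStringSubmatch returns the whole match at index 0 and group 1 at index 1:

```go
package main

import (
	"fmt"
	"regexp"
)

// groupRe is the pattern from this answer; group 1 holds the price.
var groupRe = regexp.MustCompile(`(\d+\s*k)\b`)

// extractPrice is a hypothetical helper: it returns group 1 of the first
// match in s, or false when s contains no match.
func extractPrice(s string) (string, bool) {
	m := groupRe.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	for _, s := range []string{"150 k)", "150 k.", "150 ka"} {
		price, ok := extractPrice(s)
		fmt.Printf("%q -> %q (matched: %v)\n", s, price, ok)
	}
}
```

One design note: this pattern has no leading \b, so it would also match the "150 k" inside a string like "x150 k". If that matters for your data, the leading \b from the other answer can be added.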
