Posted 2017-08-24 05:22:12, tagged go

Regex for string representation of a method call

Question

I have a string that follows a specific pattern, like so:
operator(field,value)

I'd like to use a regex to extract all three parts: operator, field, and value. I'm struggling to come up with the syntax to capture them. The value can be alphanumeric or numeric, for example:

"contains(name, Joe)"
or "lt(quantity, 2.5)"

Answer 1

Score: 0

I don't know Go, but I do know regexes, so I'll do what I can here.

You probably want one group each for the "operator", "field", and "value". I'm going to assume for now that each of these can be any combination of alphabetic, numeric, or underscore characters, at least one character long. Regex has a shortcut for that: \w matches a single alphanumeric or underscore character, and the + quantifier means "one or more". So \w+ means one or more such characters in a row. If you want a more complex definition of what these fields can be named, I'll let you specify that in your question.

You say that you want to support "operator(field,value)". I'll start without whitespace anywhere, because it's simpler and you can easily remove all whitespace yourself before running the regex. We'll later add some whitespace support to the regex if you want it, but it'll make life difficult.

To do this, we want three groups, "1(2,3)", where 1 is the operator name, 2 is the field name, and 3 is the value. Each of these, as described above, will be \w+ in our regex. We also need to match the open and close parentheses and the comma, but we won't capture them because they're just delimiters. The literal parentheses need to be escaped, since unescaped parentheses mark capture groups in a regex. The result looks like:

(\w+)\((\w+),(\w+)\)
\ 1 /  \ 2 / \ 3 /

The second line shows where each group is defined.
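
The "strip whitespace first, then match" approach described above could be sketched in Go roughly as follows. This is only an illustration: parseCall and stripSpaces are hypothetical helper names, not part of any library, and the pattern is the no-whitespace one shown above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// callRe is the no-whitespace pattern from the answer: operator(field,value).
var callRe = regexp.MustCompile(`(\w+)\((\w+),(\w+)\)`)

// stripSpaces removes every whitespace character (hypothetical helper).
func stripSpaces(s string) string {
	return strings.Join(strings.Fields(s), "")
}

// parseCall returns operator, field, value, and whether the input matched.
func parseCall(s string) (op, field, value string, ok bool) {
	m := callRe.FindStringSubmatch(stripSpaces(s))
	if m == nil {
		return "", "", "", false
	}
	// m[0] is the whole match; m[1..3] are the three capture groups.
	return m[1], m[2], m[3], true
}

func main() {
	op, field, value, ok := parseCall("contains(name, Joe)")
	fmt.Println(op, field, value, ok) // prints: contains name Joe true
}
```

Note that with \w+ for the value, an input such as "lt(quantity, 2.5)" does not match, because \w does not match the period.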

If you want to support some whitespace, you'll need to add \s* at every place where it may appear. This gets hairy, but you can do it like this:

(\w+)\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)
\ 1 /        \ 2 /       \ 3 /
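
As a rough check of the whitespace-tolerant pattern, a small Go program like the one below can run it against a few differently spaced inputs. The groups helper and the sample inputs are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// spacedRe is the whitespace-tolerant pattern shown above.
var spacedRe = regexp.MustCompile(`(\w+)\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)`)

// groups returns the three captures, or nil when the input does not match.
// (Hypothetical helper name.)
func groups(s string) []string {
	m := spacedRe.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	return m[1:]
}

func main() {
	inputs := []string{"contains(name,Joe)", "contains ( name , Joe )", "plus(no,\t44)"}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, groups(in))
	}
}
```

Because \s* sits outside the capture groups, the surrounding whitespace is never part of the captured text.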

You give an example of wanting to support floating point values, and I presume other kinds of values too. You can accomplish this using the "or" pipe, |. For example, group 3, instead of just being \w+, could be defined as:

[a-zA-Z_]\w*|\d+\.?|\d*\.\d+

This pattern supports alphanumeric-plus-underscore strings whose first character must be alphabetic or an underscore, OR integers, OR floating point numbers (defined as a digit string with a period at the beginning, middle, or end). Clearly, this can go on and on to support more complex values, but you get the idea.
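
To see which values this alternation accepts, it can be tried on its own in Go. In the sketch below the ^(?: ... )$ wrapper is my addition so that the whole string has to match; isValue is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// valueRe wraps the group 3 alternation in a non-capturing group and anchors it.
// The anchors are an assumption added for this test, not part of the answer's pattern.
var valueRe = regexp.MustCompile(`^(?:[a-zA-Z_]\w*|\d+\.?|\d*\.\d+)$`)

// isValue reports whether s is an identifier, an integer, or a float
// under the definition above.
func isValue(s string) bool {
	return valueRe.MatchString(s)
}

func main() {
	for _, v := range []string{"Joe", "_x1", "44", "2.5", ".5", "2.", "1abc", "2.5.1"} {
		fmt.Printf("%-6q %v\n", v, isValue(v))
	}
}
```

The last two inputs are rejected: "1abc" starts with a digit but is not a number, and "2.5.1" has two periods.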

So the final regex might look like:

(\w+)\s*\(\s*(\w+)\s*,\s*([a-zA-Z_]\w*|\d+\.?|\d*\.\d+)\s*\)

Sorry for not giving any Go help; I hope someone else can edit my answer and fill in that major gap.
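
For what it's worth, the final pattern could be used from Go along these lines. This is a sketch under a few assumptions: the Call type and Parse function are illustrative names, the ^ and $ anchors are added so the whole input must be a call expression, and group 3 uses [a-zA-Z_]\w* as in the group 3 definition above so that single-character values also match.

```go
package main

import (
	"fmt"
	"regexp"
)

// Call holds the three parts of operator(field,value). The type name is illustrative.
type Call struct {
	Operator, Field, Value string
}

// finalRe is the final pattern from the answer, anchored (an assumption) to the whole input.
var finalRe = regexp.MustCompile(`^(\w+)\s*\(\s*(\w+)\s*,\s*([a-zA-Z_]\w*|\d+\.?|\d*\.\d+)\s*\)$`)

// Parse extracts operator, field, and value, or returns an error if s does not match.
func Parse(s string) (Call, error) {
	m := finalRe.FindStringSubmatch(s)
	if m == nil {
		return Call{}, fmt.Errorf("not a call expression: %q", s)
	}
	return Call{Operator: m[1], Field: m[2], Value: m[3]}, nil
}

func main() {
	for _, in := range []string{"contains(name, Joe)", "lt(quantity, 2.5)", "lt(quantity, 2.5.1)"} {
		c, err := Parse(in)
		fmt.Println(c, err)
	}
}
```

FindStringSubmatch returns nil when there is no match, so checking for nil before indexing avoids a panic on malformed input.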

Answer 2

Score: 0

Use something like this to capture the groups. You may want to limit the accepted characters with [] character classes. Note the backquoted raw string literal and the \ escaping of ( and ) within the regexp:

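package main

import (
	"fmt"
	"regexp"
)

func main() {
	tests := []string{"contains(field, value)", "contains(name, Joe)", "lt(quantity, 2.5)", "plus(no,44)"} // inputs inferred from the output below
	re := regexp.MustCompile(`(.+)\((.+),\s?(.+)\)`)
	for _, t := range tests {
		fmt.Println("result", re.FindStringSubmatch(t)) // index 0 is the full match, 1-3 are the groups
	}
}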

https://play.golang.org/p/43YLTafgQt

output:

result [contains(field, value) contains field value]
result [contains(name, Joe) contains name Joe]
result [lt(quantity, 2.5) lt quantity 2.5]
result [plus(no,44) plus no 44]

Depending on how strict you want to be, you could use [a-z]+ or similar instead of .+ to match only certain characters. If you are not worried about bogus values, the pattern above is probably fine.
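
To illustrate the difference, here is a small comparison of the loose pattern with a stricter one. The strict pattern is only one possible choice: the [a-z]+ classes follow the suggestion above, while the value class [A-Za-z0-9.]+ and the ^ and $ anchors are my assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// looseRe is the pattern from the answer.
	looseRe = regexp.MustCompile(`(.+)\((.+),\s?(.+)\)`)
	// strictRe restricts each part to a character class (one possible choice).
	strictRe = regexp.MustCompile(`^([a-z]+)\(([a-z]+),\s?([A-Za-z0-9.]+)\)$`)
)

func main() {
	for _, in := range []string{"lt(quantity, 2.5)", "lt(qty, a,b)"} {
		fmt.Printf("%q\n  loose:  %q\n  strict: %q\n",
			in, looseRe.FindStringSubmatch(in), strictRe.FindStringSubmatch(in))
	}
}
```

For the bogus input "lt(qty, a,b)" the loose pattern still matches, with the greedy second group swallowing "qty, a", whereas the strict pattern returns no match.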
