Regular Expression with (too?) many cases

Question

I've been struggling with a regex for a few hours and can't find the last bit of the solution. I'm parsing a C header file line by line to find variables.

These are the possible forms of line I may encounter, all of which need to match the regex:

//#define variable_name { 300 }

#define variable_name { 300 }

//#define variable_name

#define variable_name

//#define variable_name { 300 } // Comment

#define variable_name { 300 } // Comment

#define variable_name // Comment

//#define variable_name // Comment

The following rules apply to each line above:

  • A line can start optionally with commenting slashes (i.e. //)
  • #define variable_name will always be present
  • A variable may optionally have a value (e.g. { 300 })
  • The variable value (if present) may be of any type (text, number or vector)
  • A line may have a trailing comment, either after the value or directly after the variable_name

I have managed to create the following expression, which works up to the point of the optional variable value or optional comment:

/^(\/\/)?(#define)\s(\w+)\s?(.*[\/\/]?)?

The expression can be tested here: https://regex101.com/r/krZB71/3/

The problem is visible in Group 4 of the fifth (5) and sixth (6) matches: the value and the trailing comment end up grouped together. My aim is to capture the variable_name, the optional value and the optional trailing comment in separate groups.
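
The behaviour described above can be reproduced outside regex101 with a minimal Python sketch. The name QUESTION_PATTERN is only illustrative, and the pattern is the one from the question written as a raw string without the surrounding slashes:

```python
import re

# The pattern from the question. Note that [\/\/] is a character class,
# so it matches a single "/" rather than the two-character sequence "//".
QUESTION_PATTERN = re.compile(r"^(\/\/)?(#define)\s(\w+)\s?(.*[\/\/]?)?")

line = "#define variable_name { 300 } // Comment"
match = QUESTION_PATTERN.match(line)

print(match.group(3))        # the variable name
print(repr(match.group(4)))  # value and comment, captured together
```

The greedy .* consumes the rest of the line, which is why the value and the comment land in the same group.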

I basically need help for the part after the \s?:

/^(\/\/)?(#define)\s(\w+)\s? xxxxxxxxxx

Any help is highly appreciated.

Answer 1

Score: 2

I found that using the following expression separates the variable and the comment:

^(\/\/)?(#define)\s(\w+)\s?(.*?)(\/\/.*?)?$
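
As a rough sketch of how this expression could be used when reading the header line by line, assuming Python's re module: the helper split_define and the dictionary keys are hypothetical names chosen for illustration, not part of the answer.

```python
import re

# The expression from this answer, unchanged.
ANSWER_1 = re.compile(r"^(\/\/)?(#define)\s(\w+)\s?(.*?)(\/\/.*?)?$")

def split_define(line):
    """Split one header line into its parts, or return None if it does not match."""
    m = ANSWER_1.match(line)
    if m is None:
        return None
    commented_out, _, name, value, comment = m.groups()
    return {
        "commented_out": commented_out is not None,
        "name": name,
        # Group 4 keeps the space before the comment, so strip it here;
        # an empty value is reported as None.
        "value": value.strip() or None,
        "comment": comment,
    }

for line in ["#define variable_name { 300 } // Comment",
             "//#define variable_name"]:
    print(split_define(line))
```

The lazy (.*?) stops at the first //, so a value that itself contains // (for example a URL in a string) would be cut short at that point.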

Answer 2

Score: 0

You could make the pattern a bit more specific and use optional capture groups to get the separate values.

^(\/\/)?(#define)\s(\w+)(?:\s?({[^{}]*})?\s?(\/\/\s?(.*))?)?
  • ^ Start of string
  • (\/\/)? Optional group 1, match //
  • (#define)\s Capture group 2, match #define and whitespace char
  • (\w+) Capture group 3, match 1+ word chars
  • (?: Non capture group
  • \s? Match optional whitespace char
  • ( Optional capture group 4
    • {[^{}]*} Match {...}
  • )? Close group 4 and make it optional
  • \s? Match optional whitespace char
  • ( Optional capture group 5
    • \/\/\s? Match // then optional whitespace char
    • (.*) Capture group 6 match any char except a newline
  • )? Close group 5 and make it optional
  • )? Close non capture group and make it optional so the whole last part is optional
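
The group numbering in the breakdown above can be checked with a small Python sketch. ANSWER_2 and the sample lines are illustrative names and inputs; the pattern itself is copied unchanged:

```python
import re

# The pattern from this answer. The unescaped braces are treated as
# literals here because they do not form a valid quantifier.
ANSWER_2 = re.compile(
    r"^(\/\/)?(#define)\s(\w+)(?:\s?({[^{}]*})?\s?(\/\/\s?(.*))?)?"
)

samples = [
    "#define variable_name { 300 } // Comment",
    "#define variable_name // Comment",
    "//#define variable_name",
]

for line in samples:
    m = ANSWER_2.match(line)
    # groups() returns groups 1..6; groups that did not take part are None.
    print(m.groups())
```

Because group 4 only matches {...}, a value that is not wrapped in braces is not captured by this pattern.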

Note that \s also matches a newline. If you want to match whitespace chars without the newlines you could match tabs or spaces [\t ] or match a whitespace char except the newlines [^\S\r\n].
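
The difference only shows up when the pattern is run against text that contains newlines, rather than one line at a time. A minimal sketch, using simplified hypothetical patterns (loose and strict) rather than the full expression:

```python
import re

# Two lines in one string: a define followed by an unrelated comment line.
text = "#define first\n// unrelated comment on the next line"

# \s? may consume the newline, so the comment on the next line is picked up.
loose = re.compile(r"^#define\s(\w+)\s?(\/\/.*)?")

# [^\S\r\n] matches whitespace except \r and \n, so matching stays on one line.
strict = re.compile(r"^#define[^\S\r\n](\w+)[^\S\r\n]?(\/\/.*)?")

print(loose.match(text).group(2))   # comment taken from the second line
print(strict.match(text).group(2))  # None: no comment on the first line
```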
