Regex to detect this pattern: something;something=something,something=something... for an unknown number of times
Question
Explicit rules:
- The string has two parts, separated by a semicolon.
- The first part may contain alphanumeric characters, dashes, underscores and dots.
- The second part contains key-value pairs. Each key is assigned its value with an equals sign, the pairs are comma-separated, and the number of pairs is not known in advance.
Examples:
blahblahblah;first=1,second=two
bl.hbl-hbl_hbl4hbl4h;first=1,second=two,third=thr33
The best I've come up with so far is
([A-Za-z1-9_\-\.]+);(((.+?)(?:,|$))+)
which is obviously far from correct. I am not good at writing regexes with lookaheads, lookbehinds and other relatively advanced features, but I hope a regex solution exists for this problem.
In case the regex engine matters, I am using the Perl-compatible regex engine (PCRE) in PHP 8.1.
Answer 1
Score: 3
You can try the following regex:
^([\w.-]+);([A-Za-z]\w*=\w+(?:,[A-Za-z]\w*=\w+)*)$
Regex explanation:
- ^ : start of string
- ([\w.-]+) : first part, made of alphanumeric characters, underscores, dashes and dots
- ; : semicolon
- ([A-Za-z]\w*=\w+(?:,[A-Za-z]\w*=\w+)*) : the key-value pairs
  - [A-Za-z] : a letter, the first character of the key
  - \w* : zero or more word characters (letters, digits, underscore), the rest of the key
  - = : equals sign
  - \w+ : one or more word characters, the value
  - (?:,[A-Za-z]\w*=\w+)* : non-capturing group matching zero or more further key-value pairs
    - , : comma
    - [A-Za-z]\w*=\w+ : a key-value pair, built as described above
- $ : end of string
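As a rough illustration of how this pattern might be used in PHP, here is a minimal sketch. The constant `CONFIG_PATTERN` and the helper `parseConfigString()` are hypothetical names chosen for this example and are not part of the original answer. Because a repeated group only retains its last iteration, the sketch validates the whole string with `preg_match` and then splits the second capture group with `explode` to get the individual pairs.

```php
<?php
// The pattern from the answer, wrapped in "/" delimiters for preg_match.
// Single quotes keep the backslashes intact.
const CONFIG_PATTERN = '/^([\w.-]+);([A-Za-z]\w*=\w+(?:,[A-Za-z]\w*=\w+)*)$/';

/**
 * Hypothetical helper: validates the input against the pattern and, on
 * success, returns the first part plus the key-value pairs as an array.
 * Returns null when the input does not match.
 */
function parseConfigString(string $input): ?array
{
    if (preg_match(CONFIG_PATTERN, $input, $m) !== 1) {
        return null;
    }

    $pairs = [];
    // $m[2] holds the full "k=v,k=v,..." part; the regex has already
    // guaranteed that every comma-separated piece contains one "=".
    foreach (explode(',', $m[2]) as $pair) {
        [$key, $value] = explode('=', $pair, 2);
        $pairs[$key] = $value;
    }

    return ['name' => $m[1], 'pairs' => $pairs];
}

// Example from the question: all three pairs are extracted.
var_dump(parseConfigString('bl.hbl-hbl_hbl4hbl4h;first=1,second=two,third=thr33'));

// The last pair has no "=", so the whole string is rejected (null).
var_dump(parseConfigString('blahblahblah;first=1,second'));
```

Doing the validation and the splitting in two steps keeps the regex itself simple. Note that if the same key appears twice, this sketch keeps only the last value.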