Regex to detect this pattern: something;something=something,something=something... for an unknown number of times
Question
Explicit rules:
- The string has two parts, separated by a semicolon.
- The first part may contain alphanumeric characters, dashes, underscores and dots.
- The second part contains key-value pairs. Each key is set to its value with an equals sign, the pairs are comma-separated, and the number of pairs is not known beforehand.
 
Examples:
blahblahblah;first=1,second=two
bl.hbl-hbl_hbl4hbl4h;first=1,second=two,third=thr33
The best I've come up with so far is ([A-Za-z1-9_\-\.]+);(((.+?)(?:,|$))+), which is obviously far from correct. I am not good at writing regexes with lookaheads, lookbehinds and other relatively advanced features, but I hope a regex solution exists for this problem.
In case the regex engine matters, I am using the Perl-compatible regex engine (PCRE) in PHP 8.1.
Answer 1
Score: 3
You can try the following regex:
^([\w.-]+);([A-Za-z]\w*=\w+(?:,[A-Za-z]\w*=\w+)*)$
Regex explanation:
- ^: start of string
- ([\w.-]+): the first part, made of alphanumeric characters, underscores, dashes and dots
- ;: semicolon
- ([A-Za-z]\w*=\w+(?:,[A-Za-z]\w*=\w+)*): the key-value pairs
  - [A-Za-z]: a letter, the first character of the key
  - \w*: zero or more word characters (letters, digits, underscore), the rest of the key
  - =: equals sign
  - \w+: one or more word characters, the value
  - (?:,[A-Za-z]\w*=\w+)*: non-capturing group matching zero or more further pairs, each preceded by a comma
- $: end of string
Check the demo here.
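As a quick way to check the pattern against the examples from the question, here is a minimal sketch. It is written in Python's re module rather than PHP's preg_match, which is a substitution made for illustration. The helper name parse is hypothetical, and re.ASCII is an assumption added so that \w stays limited to ASCII letters, digits and underscore.

```python
import re

# Pattern taken from the answer above. re.ASCII is an assumption of this
# sketch: it restricts \w to [A-Za-z0-9_].
PATTERN = re.compile(
    r"^([\w.-]+);([A-Za-z]\w*=\w+(?:,[A-Za-z]\w*=\w+)*)$",
    re.ASCII,
)


def parse(line):
    """Hypothetical helper: return (name, pairs) if line matches, else None."""
    # fullmatch also rejects a trailing newline, which a bare $ would tolerate.
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    name, raw_pairs = match.groups()
    # The regex has already validated the shape, so plain splitting is enough.
    # If a key appears twice, the later value overwrites the earlier one.
    pairs = dict(item.split("=", 1) for item in raw_pairs.split(","))
    return name, pairs


if __name__ == "__main__":
    for sample in (
        "blahblahblah;first=1,second=two",
        "bl.hbl-hbl_hbl4hbl4h;first=1,second=two,third=thr33",
        "blah;first=1,",  # trailing comma, rejected
    ):
        print(sample, "->", parse(sample))
```

The pair list is captured as one group and split afterwards because a capturing group placed inside a repetition only keeps the text of its last iteration, so the individual pairs cannot be read back from the match itself.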