Regex pattern works in online tool, parses in NSRegularExpression, but fails to match anything
Question
I am trying to match Roman numerals in test strings like:
Series Name.disk_V.Episode_XI.Episode_name.avi
Series Name.Season V.Episode XI.Part XXV.Episode_name.avi
and a real-world example in which the XIII should not match:
XIII: The Series season II episode V.mp4
Following the logic in this fantastic thread, and after many experiments in an online regex debugger, I came up with this:
(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-]\KM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})(?=[\s._-])
The last example returns two matches, "II" and "V", ignoring the XIII in the name part. Yay!
So then I tried it in a Swift playground:
import Foundation
let file = "Series Name.disk_V.Episode_XI.Episode_name.avi"
let p = #"(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-]\KM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})(?=[\s._-])"#
let r = try NSRegularExpression(pattern: p, options: [.caseInsensitive])
let nsString = file as NSString
let results = r.matches(in: file, options: [], range: NSMakeRange(0, nsString.length))
The pattern parses without error but returns no matches. I found that it works if I remove the \K, although that leaves the leading separator in the match. According to this thread, Obj-C (which I assume means NSRegularExpression) supports \K, so I'm not sure why this fails.
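The comparison described above can be reproduced with a short sketch. NSRegularExpression is documented as using ICU pattern syntax, and ICU's published metacharacter list does not appear to include \K, so a likely explanation (an assumption, not something confirmed in this thread) is that the engine does not treat \K as "reset the match start". The helper name `fullMatches(of:in:)` and the split into `prefix` and `roman` are hypothetical, chosen only for illustration.

```swift
import Foundation

// Sample file name from the question.
let sample = "Series Name.disk_V.Episode_XI.Episode_name.avi"

// The pattern split in two so that only the piece between the halves differs.
let prefix = #"(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-]"#
let roman = #"M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})(?=[\s._-])"#

// Hypothetical helper: returns the text of every full match of `pattern` in `text`.
// An empty array is returned if the pattern fails to compile.
func fullMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return []
    }
    let ns = text as NSString
    let whole = NSRange(location: 0, length: ns.length)
    return regex.matches(in: text, options: [], range: whole)
        .map { ns.substring(with: $0.range) }
}

let withK = fullMatches(of: prefix + #"\K"# + roman, in: sample)
let withoutK = fullMatches(of: prefix + roman, in: sample)

print("with \\K:", withK)
print("without \\K:", withoutK)  // each match still includes its leading separator
```

Without \K, the separator is consumed as part of each match, which is why the numeral then has to be isolated some other way, such as with a capture group.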
There are a number of similar-sounding threads here on SO, but they invariably involve patterns that fail to parse, mostly due to escaping. That is not the case here: the pattern parses fine, and print(r) shows that it is correct (i.e., no doubled backslashes). It just doesn't match.
Can anyone offer some insight or an alternative regex that does not use \K?
Answer 1
Score: 1
TheFourthBird's idea is the solution. I modified the pattern by removing the \K and making the entire roman section a named group:
(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-](?<roman>M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))(?=[\s._-])
To use it, set everything up as above, then read the named group from each result like this:
for result in results {
    let nameRange = result.range(withName: "roman")
    print(nsString.substring(with: nameRange))
}
Output:
V
XI
Bingo!
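For anyone who wants to run the whole thing end to end, here is a minimal self-contained sketch of the named-group approach. The helper name `romanNumerals(in:)` is hypothetical, and the guard that skips empty or missing groups is an added assumption (every part of the numeral sub-pattern is optional, so the group can match an empty string between two separators).

```swift
import Foundation

// Pattern from this answer: no \K, with the numeral wrapped in the named group "roman".
let pattern = #"(?<=d|dvd|disc|disk|s|se|season|e|ep|episode)[\s._-](?<roman>M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))(?=[\s._-])"#

// Hypothetical helper: returns the text captured by the "roman" group for each match.
func romanNumerals(in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return []
    }
    let ns = text as NSString
    let whole = NSRange(location: 0, length: ns.length)
    return regex.matches(in: text, options: [], range: whole)
        .compactMap { result -> String? in
            let r = result.range(withName: "roman")
            // Skip a group that did not take part in the match or matched nothing.
            guard r.location != NSNotFound, r.length > 0 else { return nil }
            return ns.substring(with: r)
        }
}

let fromDisk = romanNumerals(in: "Series Name.disk_V.Episode_XI.Episode_name.avi")
let fromSeries = romanNumerals(in: "XIII: The Series season II episode V.mp4")

print(fromDisk)
print(fromSeries)
```

The second call exercises the real-world case from the question, where the leading XIII has no keyword in front of it and so should be skipped.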