Regex to filter word with suffix from string
Question
I'm currently working on a .NET 4.6.2 application.
I need to write a regex to filter certain files.
Filenames that contain the word "house" and have the suffix png, jpg, or gif must be filtered out.
So far I came up with this regex:
Regex regex = new Regex(@"\b\w*house\w*\b.+.(jpg|png|gif)$");
It seems to work fine with the following file names:
- zt_housedsaf-34.png
- housedsaf-34.gif
But it doesn't match these file names:
- house.gif
- 123house.png
Do you know how to write a regex to solve this issue?
Answer 1
Score: 1
The pattern does not match the last 2 strings because .+ matches 1 or more characters, and the unescaped . after it also matches a character.
So after matching house there must be at least 2 more characters of any kind before one of the alternatives jpg, png, or gif can match. In house.gif and 123house.png only a single character (the dot) sits between house and the suffix.
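To see this behaviour, here is a minimal sketch that runs the pattern from the question, unchanged, against the four file names from the question. The class name OriginalPatternDemo is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class OriginalPatternDemo
{
    static void Main()
    {
        // The pattern from the question: ".+." requires at least two
        // characters between the "house" word and the suffix.
        var original = new Regex(@"\b\w*house\w*\b.+.(jpg|png|gif)$");

        string[] names =
        {
            "zt_housedsaf-34.png", // "-34." sits between the word and "png"
            "housedsaf-34.gif",    // "-34." sits between the word and "gif"
            "house.gif",           // only "." sits between "house" and "gif"
            "123house.png"         // only "." sits between "house" and "png"
        };

        foreach (var name in names)
        {
            // Prints True for the first two names and False for the last two.
            Console.WriteLine($"{name}: {original.IsMatch(name)}");
        }
    }
}
```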
Depending on the allowed characters, you could match 0 or more characters with .* and escape the dot as \. to match it literally.
If you don't need to capture the suffix, you can wrap the alternatives in a non-capturing group:
\b\w*house\w*\b.*\.(?:jpg|png|gif)$
Or you could narrow down the allowed characters by matching only word characters and hyphens after house, and start the pattern with \w* instead of a word boundary:
\w*house[\w-]*\.(?:jpg|png|gif)$
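As a rough check, the two patterns above can be compared side by side. The sketch below uses the four file names from the question plus three extra names (garden.png, house.txt, and "house plan.png") that are assumptions added here to show a non-match and the one case where the patterns differ. The class name FixedPatternDemo is also made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class FixedPatternDemo
{
    static void Main()
    {
        // First suggestion: any characters may follow the "house" word.
        var loose = new Regex(@"\b\w*house\w*\b.*\.(?:jpg|png|gif)$");
        // Second suggestion: only word characters and hyphens may follow.
        var strict = new Regex(@"\w*house[\w-]*\.(?:jpg|png|gif)$");

        string[] names =
        {
            "zt_housedsaf-34.png",
            "housedsaf-34.gif",
            "house.gif",
            "123house.png",
            "garden.png",      // no "house": neither pattern matches
            "house.txt",       // wrong suffix: neither pattern matches
            "house plan.png"   // space after "house": only the first pattern matches
        };

        foreach (var name in names)
        {
            Console.WriteLine(
                $"{name}: loose={loose.IsMatch(name)}, strict={strict.IsMatch(name)}");
        }
    }
}
```

Both patterns match all four names from the question. Which one fits better depends on whether characters such as spaces should be allowed between the word and the suffix.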