Regular expression to handle two different file extensions
Question
I am trying to create a regular expression that matches a file named
"abcd_04-04-2020.txt" or "abcd_04-04-2020.txt.gz".
How can I handle the "OR" condition for the extension? This is what I have so far:
if(fileName.matches("([\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}.[a-zA-Z]{3})")){
Pattern.compile("[._]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.");
}
This handles only ".txt". How can I also handle ".txt.gz"?
Thanks
Answer 1
Score: 2
Why not just use endsWith instead of a complex regex?
if(fileName.endsWith(".txt") || fileName.endsWith(".txt.gz")){
Pattern.compile("[._]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.");
}
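As a minimal sketch of this suggestion, the check below relies on endsWith alone. The class and method names are hypothetical, and note that this approach tests only the extension: the date part of the name is not checked at all.

```java
final class ExtensionCheck {
    // Returns true when the name ends with ".txt" or ".txt.gz".
    // Only the suffix is inspected; the date in the name is NOT validated.
    static boolean hasAcceptedExtension(String fileName) {
        return fileName.endsWith(".txt") || fileName.endsWith(".txt.gz");
    }

    public static void main(String[] args) {
        System.out.println(hasAcceptedExtension("abcd_04-04-2020.txt"));    // true
        System.out.println(hasAcceptedExtension("abcd_04-04-2020.txt.gz")); // true
        System.out.println(hasAcceptedExtension("abcd_04-04-2020.tar"));    // false
        System.out.println(hasAcceptedExtension("notes.txt"));              // true, because no date is required
    }
}
```

If the date format matters as well, a regex (as in the other answers) or an additional check is still needed.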
Answer 2
Score: 2
You can use the below regex to achieve your purpose:
^[\w-]+\d{2}-\d{2}-\d{4}\.txt(?:\.gz)?$
Explanation of the above regex:
> ^, $ - Match the start and the end of the test string, respectively.
>
> [\w-]+ - Matches one or more word characters or hyphens.
>
> \d{} - Matches exactly as many digits as the number given in the curly braces.
>
> (?:\.gz)? - A non-capturing group that matches .gz zero or one time, because of the ? quantifier. You could have used | alternation (the "OR" you were expecting), but this form is more legible and more efficient too.
You can find the demo of the above regex here.
IMPLEMENTATION IN JAVA:
import java.util.regex.*;

public class Main {
    // MULTILINE lets ^ and $ match at the start and end of every line of the input.
    private static final Pattern pattern = Pattern.compile("^[\\w-]+\\d{2}-\\d{2}-\\d{4}\\.txt(?:\\.gz)?$", Pattern.MULTILINE);

    public static void main(String[] args) {
        // One candidate file name per line; the last line has no name or date and is not matched.
        String testString = "abcd_04-04-2020.txt\nabcd_04-04-2020.txt.gz\nsomethibsnfkns_05-06-2020.txt\n.txt.gz";
        Matcher matcher = pattern.matcher(testString);
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}
You can find the Java implementation of the above regex here.
NOTE: If you also want to match only valid dates, please visit this.
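As a rough sketch of the date check mentioned in the note, one option is to capture the date with the same regex and hand it to java.time for strict parsing. This is not taken from the linked resource. The class name is hypothetical, and the day-month-year order ("dd-MM-uuuu") is an assumption, since the question does not say which order the file names use.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DatedFileName {
    // Same shape as the regex above, with the date captured in group 1.
    private static final Pattern NAME =
            Pattern.compile("^[\\w-]+(\\d{2}-\\d{2}-\\d{4})\\.txt(?:\\.gz)?$");

    // Assumption: the date is day-month-year. STRICT rejects impossible dates such as 31-02-2020.
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("dd-MM-uuuu").withResolverStyle(ResolverStyle.STRICT);

    // True when the name has the expected shape AND the embedded date exists on the calendar.
    static boolean isValid(String fileName) {
        Matcher matcher = NAME.matcher(fileName);
        if (!matcher.matches()) {
            return false;
        }
        try {
            LocalDate.parse(matcher.group(1), DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcd_04-04-2020.txt"));    // true
        System.out.println(isValid("abcd_04-04-2020.txt.gz")); // true
        System.out.println(isValid("abcd_31-02-2020.txt"));    // false: 31 February does not exist
        System.out.println(isValid("abcd_04-04-2020.tar"));    // false: wrong extension
    }
}
```

Keeping the regex for the overall shape and leaving calendar rules to the date parser avoids encoding month lengths and leap years in the pattern itself.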
Answer 3
Score: 1
I think what you want (following from the direction you were going) is this:
[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.[a-zA-Z]{3}(?:$|\\.[a-zA-Z]{2}$)
At the end, I have a conditional (an alternation). It has to either match the end of the string ($) OR match a literal dot followed by 2 letters (\\.[a-zA-Z]{2}). Remember to escape the . because in regex an unescaped . means "match any character".
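The effect of the alternation and of escaping the dot can be sketched as follows. The class name and the sample file names are made up for illustration; the first pattern is the one from this answer and the second is the unescaped pattern from the question.

```java
import java.util.regex.Pattern;

final class AlternationDemo {
    // Pattern from this answer: a three-letter extension, then either the end of the
    // input or a literal dot plus two letters.
    static final Pattern ESCAPED = Pattern.compile(
            "[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.[a-zA-Z]{3}(?:$|\\.[a-zA-Z]{2}$)");

    // Pattern from the question: the dot before the extension is not escaped.
    static final Pattern UNESCAPED = Pattern.compile(
            "[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}.[a-zA-Z]{3}");

    static boolean matches(Pattern pattern, String fileName) {
        return pattern.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(ESCAPED, "abcd_04-04-2020.txt"));    // true
        System.out.println(matches(ESCAPED, "abcd_04-04-2020.txt.gz")); // true
        System.out.println(matches(ESCAPED, "abcd_04-04-2020.png"));    // true: any three letters pass
        System.out.println(matches(ESCAPED, "abcd_04-04-2020xtxt"));    // false: a literal dot is required
        System.out.println(matches(UNESCAPED, "abcd_04-04-2020xtxt"));  // true: "." matched the "x"
    }
}
```

Because the extension is written as [a-zA-Z]{3}, this pattern is not limited to .txt and .txt.gz; names ending in .png or .tar.gz pass as well.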
Answer 4
Score: 1
You can replace .[a-zA-Z]{3}
with \.txt(\.gz)? (the backslashes are doubled inside a Java string literal):
if(fileName.matches("([\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4})\\.txt(\\.gz)?")){
Pattern.compile("[._]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.");
}
Answer 5
Score: 1
The ? quantifier will do the job of the | you asked for. Try adding
(\.[a-zA-Z]{2})?
to your original regex (the dots are escaped here so that they match a literal "."):
([\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\.[a-zA-Z]{3}(\.[a-zA-Z]{2})?)
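To illustrate that an optional group can stand in for an explicit alternation, the sketch below compares the two spellings. As an assumption made for the comparison, the extension is narrowed to .txt and .txt.gz rather than any letters, and the class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

final class OptionalVsAlternation {
    // Shared prefix: name characters followed by a dd-dd-dddd date.
    static final String PREFIX = "[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}";

    // ".txt", optionally followed by ".gz".
    static final Pattern OPTIONAL = Pattern.compile(PREFIX + "\\.txt(\\.gz)?");

    // The same two extensions spelled out with "|".
    static final Pattern ALTERNATION = Pattern.compile(PREFIX + "\\.(txt|txt\\.gz)");

    static boolean matches(Pattern pattern, String fileName) {
        return pattern.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "abcd_04-04-2020.txt",
            "abcd_04-04-2020.txt.gz",
            "abcd_04-04-2020.tar",
            "abcd_04-04-2020.txt.zip"
        };
        for (String sample : samples) {
            // Both patterns give the same verdict for every sample.
            System.out.println(sample + " -> " + matches(OPTIONAL, sample)
                    + " / " + matches(ALTERNATION, sample));
        }
    }
}
```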
Answer 6
Score: 1
A possible way of doing it:
Pattern pattern = Pattern.compile("^[\\w._-]+_\\d{2}-\\d{2}-\\d{4}(\\.txt(\\.gz)?)$");
Then you can run the following test:
String[] fileNames = {
    "abcd_04-04-2020.txt",
    "abcd_04-04-2020.tar",
    "abcd_04-04-2020.txt.gz",
    "abcd_04-04-2020.png",
    ".txt",
    ".txt.gz",
    "04-04-2020.txt"
};
Arrays.stream(fileNames)
    .filter(fileName -> pattern.matcher(fileName).find())
    .forEach(System.out::println);
// output
// abcd_04-04-2020.txt
// abcd_04-04-2020.txt.gz