Regular expression to handle two different file extensions

Question

I am trying to create a regular expression that matches a file named
"abcd_04-04-2020.txt" or "abcd_04-04-2020.txt.gz".

How can I handle the "OR" condition for the extension? This is what I have so far:

if(fileName.matches("([\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}.[a-zA-Z]{3})")){
    Pattern.compile("[._]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.");
}

This handles only ".txt". How can I handle ".txt.gz" as well?
Thanks

Answer 1

Score: 2

Why not just use endsWith instead of a complex regex?

if(fileName.endsWith(".txt") || fileName.endsWith(".txt.gz")){
 Pattern.compile("[._]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.");
}
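
A minimal sketch of what the suffix check does and does not cover. The class and method names are hypothetical, chosen only for illustration. Note that endsWith looks at the extension alone, so a name with no date part still passes.

```java
final class ExtensionCheck {
    // Hypothetical helper: true when the name ends with one of the two accepted extensions.
    static boolean hasAcceptedExtension(String fileName) {
        return fileName.endsWith(".txt") || fileName.endsWith(".txt.gz");
    }

    public static void main(String[] args) {
        System.out.println(hasAcceptedExtension("abcd_04-04-2020.txt"));    // true
        System.out.println(hasAcceptedExtension("abcd_04-04-2020.txt.gz")); // true
        System.out.println(hasAcceptedExtension("abcd_04-04-2020.tar.gz")); // false
        // The suffix check alone does not validate the name or the date part.
        System.out.println(hasAcceptedExtension(".txt"));                   // true
    }
}
```

If the date portion also has to be checked, the suffix test would need to be combined with a pattern for the rest of the name.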

Answer 2

Score: 2

You can use the below regex to achieve your purpose:

^[\w-]+\d{2}-\d{2}-\d{4}\.txt(?:\.gz)?$

Explanation of the above regex:
> ^ and $ - Match the start and end of the test string, respectively.
>
> [\w-]+ - Matches one or more word characters or hyphens.
>
> \d{n} - Matches exactly as many digits as the number given in the curly braces.
>
> (?:\.gz)? - A non-capturing group that matches .gz zero or one time, because of the ? quantifier. You could have used | alternation (the OR you were expecting), but this is legible and more efficient too.

You can find the demo of the above regex here.

IMPLEMENTATION IN JAVA:

import java.util.regex.*;
public class Main
{
    private static final Pattern pattern = Pattern.compile("^[\\w-]+\\d{2}-\\d{2}-\\d{4}\\.txt(?:\\.gz)?$", Pattern.MULTILINE);
    public static void main(String[] args) {
        String testString = "abcd_04-04-2020.txt\nabcd_04-04-2020.txt.gz\nsomethibsnfkns_05-06-2020.txt\n.txt.gz";
        Matcher matcher = pattern.matcher(testString);
        while(matcher.find()){
            System.out.println(matcher.group(0));
        }
    }
}

You can find the implementation of the above regex in Java here.

NOTE: If you also want to check that the dates are valid, please visit this.
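
As a rough sketch of the date check mentioned in the note, the date can be captured with a group and handed to java.time for validation. This is one possible approach, not the method behind the link. The class and method names are hypothetical, and the month-day-year order (MM-dd-uuuu) is an assumption, since "04-04-2020" does not reveal which field comes first.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DatedFileName {
    // Same pattern as above, with a capturing group around the date.
    private static final Pattern FILE_NAME =
            Pattern.compile("^[\\w-]+(\\d{2}-\\d{2}-\\d{4})\\.txt(?:\\.gz)?$");

    // Assumption: month first, then day. STRICT rejects dates such as 02-30.
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("MM-dd-uuuu").withResolverStyle(ResolverStyle.STRICT);

    // Hypothetical helper: true when the name matches and the date exists on the calendar.
    static boolean isValid(String fileName) {
        Matcher matcher = FILE_NAME.matcher(fileName);
        if (!matcher.matches()) {
            return false;
        }
        try {
            LocalDate.parse(matcher.group(1), DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("abcd_04-04-2020.txt"));    // true
        System.out.println(isValid("abcd_02-30-2020.txt.gz")); // false: February 30 does not exist
        System.out.println(isValid("abcd_04-04-2020.tar"));    // false: wrong extension
    }
}
```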

Answer 3

Score: 1

I think what you want (following from the direction you were going) is this:

[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.[a-zA-Z]{3}(?:$|\\.[a-zA-Z]{2}$)

At the end, I have an alternation. It must either match the end of the string ($) OR match a literal dot followed by 2 letters and then the end of the string (\\.[a-zA-Z]{2}$). Remember to escape the ., because in regex an unescaped . means "match any character".
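
A small sketch of how this pattern behaves with String.matches (the class and method names are hypothetical). One caveat worth seeing in action: because the extension parts are letter classes rather than literal text, any three-letter extension, optionally followed by any two-letter one, is accepted.

```java
final class AnswerThreeCheck {
    // The pattern from this answer, written as a Java string literal.
    static final String REGEX =
            "[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.[a-zA-Z]{3}(?:$|\\.[a-zA-Z]{2}$)";

    static boolean accepts(String fileName) {
        return fileName.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(accepts("abcd_04-04-2020.txt"));      // true
        System.out.println(accepts("abcd_04-04-2020.txt.gz"));   // true
        System.out.println(accepts("abcd_04-04-2020.tar.xz"));   // true: not limited to txt and gz
        System.out.println(accepts("abcd_04-04-2020.txt.gzip")); // false: second extension must be 2 letters
    }
}
```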

Answer 4

Score: 1

You can replace .[a-zA-Z]{3} with \.txt(\.gz)?

if(fileName.matches("([\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4})\\.txt(\\.gz)?")){
   Pattern.compile("[._]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.");
}
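
The dot in front of txt should be escaped as well, since a bare . matches any character. A minimal sketch comparing the two spellings (the class and constant names are hypothetical):

```java
final class EscapedDotCheck {
    // Dot before "txt" left unescaped: it matches any single character.
    static final String UNESCAPED = "[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}.txt(\\.gz)?";
    // Dot before "txt" escaped: it matches only a literal dot.
    static final String ESCAPED = "[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.txt(\\.gz)?";

    public static void main(String[] args) {
        String odd = "abcd_04-04-2020Xtxt";
        System.out.println(odd.matches(UNESCAPED)); // true: "X" is accepted in place of the dot
        System.out.println(odd.matches(ESCAPED));   // false
        System.out.println("abcd_04-04-2020.txt.gz".matches(ESCAPED)); // true
    }
}
```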

Answer 5

Score: 1

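The ? quantifier will do the job of the | you asked for, because it makes the second extension optional. Try adding

(\.[a-zA-Z]{2})?

to your original regex (with the dots escaped so that they match only a literal dot):

([\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\.[a-zA-Z]{3}(\.[a-zA-Z]{2})?)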
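
To illustrate how an optional group can stand in for an explicit OR, here is a small sketch that checks a few names against both forms. It pins the extensions to txt and gz for clarity, which is narrower than the letter classes in this answer; the class and constant names are hypothetical.

```java
import java.util.regex.Pattern;

final class OptionalVersusAlternation {
    // Optional group: ".txt", optionally followed by ".gz".
    static final Pattern OPTIONAL =
            Pattern.compile("[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.txt(\\.gz)?");
    // Explicit alternation: either "txt" or "txt.gz" after the dot.
    static final Pattern ALTERNATION =
            Pattern.compile("[\\w._-]+[0-9]{2}-[0-9]{2}-[0-9]{4}\\.(txt|txt\\.gz)");

    public static void main(String[] args) {
        String[] names = {
                "abcd_04-04-2020.txt",
                "abcd_04-04-2020.txt.gz",
                "abcd_04-04-2020.gz"
        };
        for (String name : names) {
            // matches() requires the whole name to match, so no ^ or $ is needed here.
            boolean viaOptional = OPTIONAL.matcher(name).matches();
            boolean viaAlternation = ALTERNATION.matcher(name).matches();
            System.out.println(name + " -> " + viaOptional + " / " + viaAlternation);
        }
    }
}
```

Both forms agree on these three names: the first two are accepted and the last is rejected.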

Answer 6

Score: 1

A possible way of doing it:

Pattern pattern = Pattern.compile("^[\\w._-]+_\\d{2}-\\d{2}-\\d{4}(\\.txt(\\.gz)?)$");

Then you can run the following test:

String[] fileNames = {
        "abcd_04-04-2020.txt",
        "abcd_04-04-2020.tar",
        "abcd_04-04-2020.txt.gz",
        "abcd_04-04-2020.png",
        ".txt",
        ".txt.gz",
        "04-04-2020.txt"
};

Arrays.stream(fileNames)
        .filter(fileName -> pattern.matcher(fileName).find())
        .forEach(System.out::println);

// output
// abcd_04-04-2020.txt
// abcd_04-04-2020.txt.gz

huangapple
  • Posted on May 30, 2020 at 01:08:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/62091205.html