Regular Expression - isMediaFile

Question

I am trying to check whether a file name refers to a media file, but the check fails.

It returns false.

My expression:

MEDIA_PATTERN = "([^\\s]+(\\.(?i)(aif|iff|m3u|m4a|mid|mp3|mpa|wav|wma|3g2|3gp|asf|avi|flv|m4v|mov|mp4|mpg|rm|srt|swf|vob|wmv|3d))$)";

I changed it to

MEDIA_PATTERN = "([^.*]+(\\.(?i)(m3u|m4a|mp3|mpa|mkv|wav|avi|flv|m4v|mov|mp4|vob))$)";

That works, but it still fails when the file name looks like this:

System.out.println(isMediaFile("Chaplin.1992.720p.BrRip.x264.YIFY.mp4"));

My sample file names:

public static void isMediaFileTest() {
    System.out.println(isMediaFile("Electrifying Bhupalam Thillana - Sridevi Nrithyalaya - Bharathanatyam Dance.mp4"));
    System.out.println(isMediaFile("Electrifying Bhupalam Thillana Sridevi Nrithyalaya Bharathanatyam Dance.mp4"));
}

public static boolean isMediaFile(String str) {
    Pattern p = Pattern.compile(Constants.MEDIA_PATTERN);
    if (str == null) {
        return false;
    }
    return p.matcher(str).matches();
}

What is wrong?

Answer 1

Score: 1

You can define MEDIA_PATTERN like this:

MEDIA_PATTERN = "(?i).*\\.(?:aif|iff|m3u|m4a|mid|mp3|mpa|wav|wma|3g2|3gp|asf|avi|flv|m4v|mov|mp4|mpg|rm|srt|swf|vob|wmv|3d)";

where

  • [^\s]+ (equivalent to \S+, which matches one or more non-whitespace characters and therefore rejects any name that contains a space) is replaced with .*, which matches zero or more characters other than line terminators, as many as possible. The .* is needed because the regex is used with Matcher#matches(), which requires the whole string to match.
  • $ is removed because it is redundant: Matcher#matches() already requires the match to end at the end of the string.
  • Some redundant capturing groups are removed; the one remaining group is non-capturing, (?:...).
  • (?i) is moved to the start of the pattern. The extensions are matched case-insensitively either way, but at the start it clearly applies to the whole pattern.
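
To make the difference between the three patterns concrete, here is a small sketch that runs each one against the two kinds of file name from the question. The class name MediaPatternComparison, the constant names, and the helper fullMatch are made up for this illustration; the extension list of the last pattern is shortened for readability.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; none of these names come from the original post.
final class MediaPatternComparison {
    // First pattern from the question: [^\s]+ cannot cross whitespace.
    static final Pattern NO_WHITESPACE = Pattern.compile(
        "([^\\s]+(\\.(?i)(aif|iff|m3u|m4a|mid|mp3|mpa|wav|wma|3g2|3gp|asf|avi|flv|m4v|mov|mp4|mpg|rm|srt|swf|vob|wmv|3d))$)");

    // Second pattern from the question: [^.*]+ cannot cross a dot (or an asterisk).
    static final Pattern NO_DOTS = Pattern.compile(
        "([^.*]+(\\.(?i)(m3u|m4a|mp3|mpa|mkv|wav|avi|flv|m4v|mov|mp4|vob))$)");

    // Pattern in the style of the answer (extension list shortened here).
    static final Pattern ANY_PREFIX = Pattern.compile(
        "(?i).*\\.(?:m3u|m4a|mp3|mpa|mkv|wav|avi|flv|m4v|mov|mp4|vob)");

    // matches() succeeds only if the pattern consumes the entire input.
    static boolean fullMatch(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String spaced = "Electrifying Bhupalam Thillana - Sridevi Nrithyalaya - Bharathanatyam Dance.mp4";
        String dotted = "Chaplin.1992.720p.BrRip.x264.YIFY.mp4";
        for (String name : new String[] {spaced, dotted}) {
            System.out.println(name);
            System.out.println("  NO_WHITESPACE: " + fullMatch(NO_WHITESPACE, name));
            System.out.println("  NO_DOTS:       " + fullMatch(NO_DOTS, name));
            System.out.println("  ANY_PREFIX:    " + fullMatch(ANY_PREFIX, name));
        }
    }
}
```

The name with spaces is rejected only by NO_WHITESPACE, the name with several dots is rejected only by NO_DOTS, and ANY_PREFIX accepts both, because .* places no restriction on what comes before the final extension.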

You should also compile the Pattern once, outside the method that uses it, for better performance (see @k314159's comment):

public static final Pattern p = Pattern.compile(Constants.MEDIA_PATTERN);
public static boolean isMediaFile(String str)
{
    if (str == null) {
        return false;
    }
    return p.matcher(str).matches();
}

See the Java demo online.
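
Since the snippet above depends on a Constants class that is not shown, here is a self-contained sketch of the same idea that can be compiled and run on its own. The class name MediaFiles and the test inputs in main are assumptions made for this example; the pattern is the one given in this answer.

```java
import java.util.regex.Pattern;

// Hypothetical holder class; the original post keeps the pattern in Constants.
final class MediaFiles {
    // Compiled once when the class loads. Pattern instances are immutable and
    // safe to share between threads, so one static instance is enough.
    private static final Pattern MEDIA_PATTERN = Pattern.compile(
        "(?i).*\\.(?:aif|iff|m3u|m4a|mid|mp3|mpa|wav|wma|3g2|3gp|asf|avi|flv|m4v|mov|mp4|mpg|rm|srt|swf|vob|wmv|3d)");

    private MediaFiles() {
        // Utility class, not meant to be instantiated.
    }

    // Returns true when the whole name ends in one of the listed extensions.
    static boolean isMediaFile(String name) {
        return name != null && MEDIA_PATTERN.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMediaFile("Chaplin.1992.720p.BrRip.x264.YIFY.mp4"));
        System.out.println(isMediaFile("Electrifying Bhupalam Thillana - Sridevi Nrithyalaya - Bharathanatyam Dance.mp4"));
        System.out.println(isMediaFile("CLIP.MP4"));   // upper-case extension, covered by (?i)
        System.out.println(isMediaFile("notes.txt"));  // extension not in the list
        System.out.println(isMediaFile(null));
    }
}
```

Under these assumptions the first three calls print true and the last two print false.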

huangapple
  • Posted on September 10, 2020, 16:17:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63825576.html