Regex to match movie file
Question
I am trying to write a regex that extracts the movie title from a file name. The regex should match the title in all of the example files. Currently I can only get it to work for some of them, using this regex: ^(.+).(\d{4}p)
I am using it in Java, with the java.util.regex package.
I would like it to work when the movie file name has one of these formats:
 - {title} {year} {resolution} etc.
 - {title} {resolution} {year} etc.
 - {title} {resolution} etc.
 - {title} {year} etc.
 - the title itself contains a year or is just a year, like the movie 2012 (2009)
 
Example files:
Film.2017.720p.BluRay.H264.AAC.mp4
Film.And.The.Film.2017.1080p.BluRay.x264.mp4
152.Seconds.2010.1080p.BluRay.x264.mp4
2015.2005.1080p.BluRay.x264.mp4
Java code:
public static void main(String[] args)
{
    ArrayList<String> movies = new ArrayList<>();
    movies.add("Film.2017.720p.BluRay.H264.AAC.mp4");
    movies.add("Film.And.The.Film.2017.1080p.BluRay.x264.mp4");
    movies.add("152.Seconds.2010.1080p.BluRay.x264.mp4");
    movies.add("2015.2005.1080p.BluRay.x264.mp4");
    for (String s : movies)
    {
        System.out.println("original file: \t" + s);
        System.out.println("new file: \t\t" + getTitleFromFile(s) + "\n");
    }
}
    
private static String getTitleFromFile(String fileName)
{
    Pattern pattern = Pattern.compile("^(.+).(\\d{4}p)");
    Matcher m = pattern.matcher(fileName);
    if (m.find())
    {
        return m.group();
    }
    else
    {
        return null;
    }
}
Actual Output:
original file: 	Film.2017.720p.BluRay.H264.AAC.mp4
new file: 		null
original file: 	Film.And.The.Film.2017.1080p.BluRay.x264.mp4
new file: 		null
original file: 	Film 2015 1080p BluRay x264 DTS.mp4
new file: 		Film 2015 1080p
original file: 	Film.1080p.BrRip.x264.mp4
new file: 		Film.1080p
Expected Output:
original file: 	Film.2017.720p.BluRay.H264.AAC.mp4
new file: 		Film
original file: 	Film.And.The.Film.2017.1080p.BluRay.x264.mp4
new file: 		Film And The Film
original file: 	Film 2015 1080p BluRay x264 DTS.mp4
new file: 		Film
original file: 	Film.1080p.BrRip.x264.mp4
new file: 		Film
Answer 1
Score: 1
You may use
^(.*?)\W(?:(\d{4})(?:\W(\d+p)?)|(\d+p)(?:\W(\d{4}))?)\b
See the regex demo.
Details
- ^ - start of string
- (.*?) - Group 1: the name, any zero or more chars other than line break chars, as few as possible
- \W - a non-word char
- (?:(\d{4})(?:\W(\d+p)?)|(\d+p)(?:\W(\d{4}))?) - either of:
  - (\d{4})(?:\W(\d+p)?) - Group 2: four digits, followed by a non-word char and then, optionally, one or more digits and p captured in Group 3
  - | - or
  - (\d+p)(?:\W(\d{4}))? - Group 4: one or more digits and p, followed by an optional group matching a non-word char and then four digits captured in Group 5
- \b - word boundary
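To make the group numbering above easier to follow, here is a minimal sketch of the same pattern with named groups. The group names (name, year1, res1, res2, year2), the class NamedGroupSketch, and the helper describe are illustrative assumptions and not part of the original answer; the structure of the regex is unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupSketch {
    // Same structure as the pattern above; only the (hypothetical) group names are added.
    static final Pattern MOVIE = Pattern.compile(
        "^(?<name>.*?)\\W(?:"
        + "(?<year1>\\d{4})(?:\\W(?<res1>\\d+p)?)"    // year first, resolution optional
        + "|(?<res2>\\d+p)(?:\\W(?<year2>\\d{4}))?"   // resolution first, year optional
        + ")\\b");

    // Returns a one-line summary of what each part of the pattern captured.
    static String describe(String fileName) {
        Matcher m = MOVIE.matcher(fileName);
        if (!m.find()) {
            return "no match";
        }
        // Only one branch of the alternation participates, so pick whichever is non-null.
        String year = m.group("year1") != null ? m.group("year1") : m.group("year2");
        String res = m.group("res1") != null ? m.group("res1") : m.group("res2");
        return "name=" + m.group("name") + ", year=" + year + ", resolution=" + res;
    }

    public static void main(String[] args) {
        // prints: name=Film.And.The.Film, year=2017, resolution=1080p
        System.out.println(describe("Film.And.The.Film.2017.1080p.BluRay.x264.mp4"));
        // prints: name=2015, year=2005, resolution=1080p
        System.out.println(describe("2015.2005.1080p.BluRay.x264.mp4"));
        // prints: name=Film, year=null, resolution=1080p
        System.out.println(describe("Film.1080p.BrRip.x264.mp4"));
    }
}
```

Because the name group is lazy, it stops at the first position where a year or resolution follows, which is why a title that is itself a year (2015) is still kept as the name.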
Java demo:
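// Requires: java.util.Arrays, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
List<String> strs = Arrays.asList(
	"Film.The.Film.2004.720p.BrRip.x264.BOKUTOX.mp4",
	"Film.The.Film.2020.BrRip.x264.mp4",
	"Film.The.Film.720p.2020.BrRip.x264.mp4",
	"Film.The.Film.720p.BrRip.x264.mp4");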
Pattern p = Pattern.compile("^(.*?)\\W(?:(\\d{4})(?:\\W(\\d+p)?)|(\\d+p)(?:\\W(\\d{4}))?)\\b");
for (String str : strs) {
	Matcher m = p.matcher(str);
	if (m.find()) {
		System.out.println("\n--------\nName: " + m.group(1).replace(".", " "));
		if (m.group(2) != null) {
			System.out.println("Year: " + m.group(2));
			if (m.group(3) != null) {
				System.out.println("Resolution: " + m.group(3));
			}
		}
		else {
			System.out.println("Resolution: " + m.group(4));
			if (m.group(5) != null) {
				System.out.println("Year: " + m.group(5));
			}
		}
	}
}
Output:
--------
Name: Film The Film
Year: 2004
Resolution: 720p
--------
Name: Film The Film
Year: 2020
--------
Name: Film The Film
Resolution: 720p
Year: 2020
--------
Name: Film The Film
Resolution: 720p
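
As a rough sketch of how the pattern above could be dropped into the question's getTitleFromFile helper: the question's version returned m.group(), which is the whole match, whereas the title sits in Group 1. The class name TitleExtractor is an assumption made for this example, and returning null on no match simply mirrors the question's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TitleExtractor {
    // The pattern from the answer, compiled once and reused.
    private static final Pattern MOVIE = Pattern.compile(
        "^(.*?)\\W(?:(\\d{4})(?:\\W(\\d+p)?)|(\\d+p)(?:\\W(\\d{4}))?)\\b");

    // Returns the title with dots replaced by spaces, or null when nothing matches.
    static String getTitleFromFile(String fileName) {
        Matcher m = MOVIE.matcher(fileName);
        if (m.find()) {
            return m.group(1).replace('.', ' '); // Group 1 is the title only
        }
        return null;
    }

    public static void main(String[] args) {
        String[] movies = {
            "Film.2017.720p.BluRay.H264.AAC.mp4",
            "Film.And.The.Film.2017.1080p.BluRay.x264.mp4",
            "152.Seconds.2010.1080p.BluRay.x264.mp4",
            "2015.2005.1080p.BluRay.x264.mp4",
            "Film 2015 1080p BluRay x264 DTS.mp4",
            "Film.1080p.BrRip.x264.mp4"
        };
        for (String s : movies) {
            System.out.println("original file: \t" + s);
            System.out.println("new file: \t\t" + getTitleFromFile(s) + "\n");
        }
    }
}
```

For the six file names listed in main, this should yield Film, Film And The Film, 152 Seconds, 2015, Film, and Film respectively.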