Regex to match movie file
Question
I am trying to write a regex that extracts the movie title from a file name. The regex should match the title in all of the example files. Currently I can only get it to work for some of them, using this regex: `^(.+).(\d{4}p)`
I am using Java, with the java.util.regex package.
I would like it to work when the movie file format is:
- {title} {year} {resolution} etc.
- {title} {resolution} {year} etc.
- {title} {resolution} etc.
- {title} {year} etc.
- when the title contains a year or is itself a year, as with the movie 2012 (2009)
Example files:

    Film.2017.720p.BluRay.H264.AAC.mp4
    Film.And.The.Film.2017.1080p.BluRay.x264.mp4
    152.Seconds.2010.1080p.BluRay.x264.mp4
    2015.2005.1080p.BluRay.x264.mp4
Java code:

    import java.util.ArrayList;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main
    {
        public static void main(String[] args)
        {
            ArrayList<String> movies = new ArrayList<>();
            movies.add("Film.2017.720p.BluRay.H264.AAC.mp4");
            movies.add("Film.And.The.Film.2017.1080p.BluRay.x264.mp4");
            movies.add("152.Seconds.2010.1080p.BluRay.x264.mp4");
            movies.add("2015.2005.1080p.BluRay.x264.mp4");
            for (String s : movies)
            {
                System.out.println("original file: \t" + s);
                System.out.println("new file: \t\t" + getTitleFromFile(s) + "\n");
            }
        }

        private static String getTitleFromFile(String fileName)
        {
            Pattern pattern = Pattern.compile("^(.+).(\\d{4}p)");
            Matcher m = pattern.matcher(fileName);
            if (m.find())
            {
                // group() with no argument returns the whole match, not a capture group
                return m.group();
            }
            else
            {
                return null;
            }
        }
    }
Actual output:

    original file: Film.2017.720p.BluRay.H264.AAC.mp4
    new file: null

    original file: Film.And.The.Film.2017.1080p.BluRay.x264.mp4
    new file: null

    original file: Film 2015 1080p BluRay x264 DTS.mp4
    new file: Film 2015 1080p

    original file: Film.1080p.BrRip.x264.mp4
    new file: Film.1080p

Expected output:

    original file: Film.2017.720p.BluRay.H264.AAC.mp4
    new file: Film

    original file: Film.And.The.Film.2017.1080p.BluRay.x264.mp4
    new file: Film And The Film

    original file: Film 2015 1080p BluRay x264 DTS.mp4
    new file: Film

    original file: Film.1080p.BrRip.x264.mp4
    new file: Film
Answer 1
Score: 1
You may use

    ^(.*?)\W(?:(\d{4})(?:\W(\d+p)?)|(\d+p)(?:\W(\d{4}))?)\b

See the regex demo.
Details

- `^` - start of string
- `(.*?)` - Group 1: the name, any 0 or more chars other than line break chars, as few as possible
- `\W` - a non-word char
- `(?:(\d{4})(?:\W(\d+p)?)|(\d+p)(?:\W(\d{4}))?)` - either of
  - `(\d{4})(?:\W(\d+p)?)` - Group 2: four digits, followed by a non-word char and then an optional Group 3 capturing one or more digits and `p`
  - `|` - or
  - `(\d+p)(?:\W(\d{4}))?` - Group 4: one or more digits and `p`, followed by an optional group matching a non-word char and then four digits captured in Group 5
- `\b` - word boundary
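To tie the breakdown above back to the `getTitleFromFile` method from the question, here is a minimal sketch that applies the same pattern and returns only Group 1, with dots turned into spaces. The class name `TitleExtractor` is made up for illustration; the pattern itself is copied unchanged from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleExtractor {
    // Pattern copied verbatim from the answer; compiled once and reused.
    private static final Pattern MOVIE = Pattern.compile(
        "^(.*?)\\W(?:(\\d{4})(?:\\W(\\d+p)?)|(\\d+p)(?:\\W(\\d{4}))?)\\b");

    /** Returns Group 1 (the title) with dots replaced by spaces, or null if nothing matches. */
    public static String getTitleFromFile(String fileName) {
        Matcher m = MOVIE.matcher(fileName);
        return m.find() ? m.group(1).replace('.', ' ') : null;
    }

    public static void main(String[] args) {
        // The four example files from the question.
        System.out.println(getTitleFromFile("Film.2017.720p.BluRay.H264.AAC.mp4"));           // Film
        System.out.println(getTitleFromFile("Film.And.The.Film.2017.1080p.BluRay.x264.mp4")); // Film And The Film
        System.out.println(getTitleFromFile("152.Seconds.2010.1080p.BluRay.x264.mp4"));       // 152 Seconds
        System.out.println(getTitleFromFile("2015.2005.1080p.BluRay.x264.mp4"));              // 2015
    }
}
```

Because Group 1 is lazy and must be followed by a non-word char plus a year or resolution, a leading number such as `152` or `2015` stays in the title. A file name with neither a four-digit year nor a resolution yields `null` in this sketch.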
Java demo:

    import java.util.Arrays;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        public static void main(String[] args) {
            List<String> strs = Arrays.asList(
                "Film.The.Film.2004.720p.BrRip.x264.BOKUTOX.mp4",
                "Film.The.Film.2020.BrRip.x264.mp4",
                "Film.The.Film.720p.2020.BrRip.x264.mp4",
                "Film.The.Film.720p.BrRip.x264.mp4");
            // Group 1: name; Groups 2/3: year, then resolution; Groups 4/5: resolution, then year
            Pattern p = Pattern.compile("^(.*?)\\W(?:(\\d{4})(?:\\W(\\d+p)?)|(\\d+p)(?:\\W(\\d{4}))?)\\b");
            for (String str : strs) {
                Matcher m = p.matcher(str);
                if (m.find()) {
                    System.out.println("\n--------\nName: " + m.group(1).replace(".", " "));
                    if (m.group(2) != null) {
                        // Year-first branch
                        System.out.println("Year: " + m.group(2));
                        if (m.group(3) != null) {
                            System.out.println("Resolution: " + m.group(3));
                        }
                    } else {
                        // Resolution-first branch
                        System.out.println("Resolution: " + m.group(4));
                        if (m.group(5) != null) {
                            System.out.println("Year: " + m.group(5));
                        }
                    }
                }
            }
        }
    }
Output:

    --------
    Name: Film The Film
    Year: 2004
    Resolution: 720p

    --------
    Name: Film The Film
    Year: 2020

    --------
    Name: Film The Film
    Resolution: 720p
    Year: 2020

    --------
    Name: Film The Film
    Resolution: 720p
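As a side note on why the pattern from the question, `^(.+).(\d{4}p)`, falls short: `\d{4}p` needs exactly four digits before the `p`, so `720p` never matches, and `m.group()` returns the whole match rather than a capture group. The sketch below runs that pattern unchanged to show both effects; the class and method names are hypothetical and only serve the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OriginalPatternCheck {
    // The pattern from the question, unchanged.
    private static final Pattern ORIGINAL = Pattern.compile("^(.+).(\\d{4}p)");

    /** Whole match, which is what the question's code returns (group() is group(0)). */
    public static String wholeMatch(String fileName) {
        Matcher m = ORIGINAL.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    /** First capture group only. */
    public static String firstGroup(String fileName) {
        Matcher m = ORIGINAL.matcher(fileName);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // No four-digit resolution anywhere in the name, so there is no match.
        System.out.println(wholeMatch("Film.2017.720p.BluRay.H264.AAC.mp4"));  // null
        // The whole match includes the resolution.
        System.out.println(wholeMatch("Film.1080p.BrRip.x264.mp4"));           // Film.1080p
        // Group 1 drops the resolution...
        System.out.println(firstGroup("Film.1080p.BrRip.x264.mp4"));           // Film
        // ...but the greedy (.+) still swallows a year that precedes it.
        System.out.println(firstGroup("Film 2015 1080p BluRay x264 DTS.mp4")); // Film 2015
    }
}
```

Switching to `group(1)` alone is therefore not enough, which is why the pattern in this answer matches the year and the resolution explicitly and keeps the title group lazy.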