Dealing with differently formatted Strings and trying to categorize their data using Java

Question

I have a spreadsheet column that contains opening hours for different establishments. Using Java, I am trying to store that data in variables (monHours, tuesHours, etc.), but the roadblock is that the entries are formatted in many different ways. Here are some examples of the variations:

"Monday – Friday: 8am – 5pm, Saturday Closed, Sunday Closed"

"Mon 7:15 AM - 4:00 PM, Tue 7:15 AM - 4:00 PM, Wed 7:15 AM - 4:00 PM, Thu 7:15 AM - 4:00 PM, Fri Closed, Sat Closed, Sun Closed"

"Monday 4AM-8PM-Saturday 4AM-8PM, Sunday Closed"

"Mon Closed, Tue Closed, Wed 7 - 9:00 AM, Thu Closed, Fri Closed, Sat 7 - 9:00 AM, Sun Closed"

"24/7"

Again, I am just trying to figure out an algorithm in Java that will take any of those strings and store the opening hours for each day in a variable. I should also note that I can't change the data to one singular format as there are hundreds of entries.

I know that it probably has something to do with string methods, specifically substrings. I have already figured that removing whitespace and commas would probably help, but I am overall struggling to find an algorithm that works for all cases.

EDIT

For example if I had an entry of a String like
> "Monday – Friday: 8am – 5pm, Saturday Closed, Sunday Closed"

I would create an array of Strings like

["Monday: 8AM-5PM", "Tuesday: 8AM-5PM", "Wednesday: 8AM-5PM", "Thursday: 8AM-5PM", "Friday: 8AM-5PM", "Saturday: closed", "Sunday: closed"].

The specific formatting of each element in the array may be different but I imagine you get the idea of how I wish to store it.

Answer 1

Score: 1

It's a complex parse. You can start by splitting each string on the commas.

String[] strings = {
    "Monday – Friday: 8am – 5pm, Saturday Closed, Sunday Closed",
    "Mon 7:15 AM - 4:00 PM, Tue 7:15 AM - 4:00 PM, Wed 7:15 AM - 4:00 PM, Thu 7:15 AM - 4:00 PM, Fri Closed, Sat Closed, Sun Closed",
    "Monday 4AM-8PM-Saturday 4AM-8PM, Sunday Closed",
    "Mon Closed, Tue Closed, Wed 7 - 9:00 AM, Thu Closed, Fri Closed, Sat 7 - 9:00 AM, Sun Closed",
    "24/7"
};
String[] values;
for (String string : strings) {
    values = string.split(", ");
    for (String value : values) System.out.println(value);
    System.out.println();
}

Output

Monday – Friday: 8am – 5pm
Saturday Closed
Sunday Closed

Mon 7:15 AM - 4:00 PM
Tue 7:15 AM - 4:00 PM
Wed 7:15 AM - 4:00 PM
Thu 7:15 AM - 4:00 PM
Fri Closed
Sat Closed
Sun Closed

Monday 4AM-8PM-Saturday 4AM-8PM
Sunday Closed

Mon Closed
Tue Closed
Wed 7 - 9:00 AM
Thu Closed
Fri Closed
Sat 7 - 9:00 AM
Sun Closed

24/7
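
After the split, a piece like "Monday – Friday: 8am – 5pm" still stands for five days. Here is a minimal sketch of one way to expand such a range, assuming day names can be recognised by their first three letters. The class name DayRangeSketch and the helpers parseDay and expand are made up for illustration; DayOfWeek is the standard java.time enum.

```java
import java.time.DayOfWeek;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class DayRangeSketch {
    // Map "Mon", "monday", "Friday:" etc. to a DayOfWeek using the first three letters.
    // Assumption: any text starting with a three-letter day prefix is a day name.
    public static DayOfWeek parseDay(String text) {
        String key = text.trim().toLowerCase(Locale.ROOT);
        for (DayOfWeek day : DayOfWeek.values()) {
            String prefix = day.name().substring(0, 3).toLowerCase(Locale.ROOT);
            if (key.startsWith(prefix)) {
                return day;
            }
        }
        throw new IllegalArgumentException("Not a day name: " + text);
    }

    // List every day from first to last inclusive, wrapping past Sunday if needed.
    public static List<DayOfWeek> expand(DayOfWeek first, DayOfWeek last) {
        List<DayOfWeek> days = new ArrayList<>();
        DayOfWeek day = first;
        days.add(day);
        while (day != last) {
            day = day.plus(1); // DayOfWeek.plus wraps from SUNDAY back to MONDAY
            days.add(day);
        }
        return days;
    }

    public static void main(String[] args) {
        System.out.println(expand(parseDay("Monday"), parseDay("Friday")));
    }
}
```

The hours text after the colon can then be copied to each day in the returned list. Matching on a three-letter prefix is deliberately loose, so it would also accept a word like "Monitor"; tighten it if your data needs that.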

> "Again, I am just trying to figure out an algorithm in Java that will take any of those strings and store the opening hours for each day in a variable"

You can use a regular expression pattern, although I wouldn't recommend it if the format is too arbitrary.

For example, this is specific to matching only the formats you've provided.

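(?i)((sun|mon|tue|wed|thu|fri|sat)(?:(?:s|nes|rs|ur)?day)?:? (\d?\d(?::\d\d)?(?: ?[ap]m)?))

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Matches a day name (short or full), an optional colon, and the first time that follows it
Pattern pattern = Pattern.compile("(?i)((sun|mon|tue|wed|thu|fri|sat)(?:(?:s|nes|rs|ur)?day)?:? (\\d?\\d(?::\\d\\d)?(?: ?[ap]m)?))");
Matcher matcher;
for (String string : strings) {
    matcher = pattern.matcher(string);
    while (matcher.find()) {
        // group() is the whole match, e.g. "Mon 7:15 AM"
        System.out.println(matcher.group());
    }
}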

Output

Friday: 8am
Mon 7:15 AM
Tue 7:15 AM
Wed 7:15 AM
Thu 7:15 AM
Monday 4AM
Saturday 4AM
Wed 7
Sat 7
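
The pattern above stops at the opening time. If you also want the closing time or the word "Closed", a variation along these lines might work. This is only a sketch, under the assumption that each entry is a day name followed by either "Closed" or two times separated by a hyphen or en dash; HoursExtractSketch, TIME, ENTRY and extract are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HoursExtractSketch {
    // A time such as "8am", "7:15 AM" or a bare "7".
    private static final String TIME = "\\d{1,2}(?::\\d{2})?(?:\\s*[ap]m)?";

    // Day name, optional colon, then either "Closed" or "time - time"
    // (ASCII hyphen or en dash, \u2013, between the two times).
    private static final Pattern ENTRY = Pattern.compile(
            "(?i)\\b(sun|mon|tue|wed|thu|fri|sat)[a-z]*:?\\s+"
            + "(closed|" + TIME + "\\s*[-\u2013]\\s*" + TIME + ")");

    // Return one "DAY=hours" string per match, in order of appearance.
    public static List<String> extract(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = ENTRY.matcher(input);
        while (matcher.find()) {
            String day = matcher.group(1).toUpperCase(Locale.ROOT);
            found.add(day + "=" + matcher.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("Mon Closed, Wed 7 - 9:00 AM, Sat 7 - 9:00 AM"));
    }
}
```

A range such as "Monday – Friday: 8am – 5pm" is only picked up under its last day (Friday) here, so day ranges still need to be handled separately, and "24/7" produces no match at all.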

Answer 2

Score: 0

I would suggest implementing one method for each expected format. Each method tries to parse the string on the assumption that it is in the corresponding format. If parsing succeeds, it returns the result; otherwise it throws an exception (or returns null, or whatever you like). Write as many methods as there are formats you expect, and run them one by one until one of them returns a successfully parsed value.
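
A rough sketch of that idea, covering only two of the formats from the question. It signals "not my format" with Optional.empty() in place of an exception or null, and it keeps the hours as plain text; FormatChainSketch, parseAlwaysOpen, parseDayByDay and the "Open 24 hours" label are all invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class FormatChainSketch {
    static final List<String> DAYS = List.of("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun");

    // Format A: the literal "24/7" means open all day, every day.
    public static Optional<Map<String, String>> parseAlwaysOpen(String input) {
        if (!input.trim().equals("24/7")) {
            return Optional.empty();
        }
        Map<String, String> hours = new LinkedHashMap<>();
        for (String day : DAYS) {
            hours.put(day, "Open 24 hours");
        }
        return Optional.of(hours);
    }

    // Format B: one "Day hours" entry per day, separated by commas,
    // e.g. "Mon 7:15 AM - 4:00 PM, Tue Closed, ...".
    public static Optional<Map<String, String>> parseDayByDay(String input) {
        Map<String, String> hours = new LinkedHashMap<>();
        for (String piece : input.split(",\\s*")) {
            String[] parts = piece.trim().split("\\s+", 2);
            if (parts.length < 2 || !DAYS.contains(parts[0])) {
                return Optional.empty(); // not this format
            }
            hours.put(parts[0], parts[1]);
        }
        // Accept only if all seven days were present exactly once.
        return hours.size() == DAYS.size() ? Optional.of(hours) : Optional.empty();
    }

    // Try each parser in turn and return the first successful result.
    public static Map<String, String> parse(String input) {
        List<Function<String, Optional<Map<String, String>>>> parsers =
                List.of(FormatChainSketch::parseAlwaysOpen, FormatChainSketch::parseDayByDay);
        for (Function<String, Optional<Map<String, String>>> parser : parsers) {
            Optional<Map<String, String>> result = parser.apply(input);
            if (result.isPresent()) {
                return result.get();
            }
        }
        throw new IllegalArgumentException("Unrecognised format: " + input);
    }

    public static void main(String[] args) {
        System.out.println(parse("24/7"));
        System.out.println(parse("Mon Closed, Tue Closed, Wed 7 - 9:00 AM, Thu Closed, "
                + "Fri Closed, Sat 7 - 9:00 AM, Sun Closed"));
    }
}
```

Supporting another format then means writing one more method and adding it to the list in parse. Entries that no parser recognises end up in the exception, which gives you a list of rows to inspect by hand.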

Answer 3

Score: 0

Parsing is another issue, but I would create a proper data model to hold your opening times, perhaps something like the below:

import java.time.DayOfWeek;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DaySchedule {
    // One opening period within a day (records need Java 16 or later)
    record FromTo(LocalTime from, LocalTime to) {}

    public static void main(String[] args) {
        // Each day maps to a list, so a day can hold several opening periods
        Map<DayOfWeek, List<FromTo>> openingTimes = new HashMap<>();
        // Wrap in an ArrayList so the list stays mutable
        openingTimes.put(DayOfWeek.MONDAY, new ArrayList<>(List.of(new FromTo(LocalTime.of(14, 20), LocalTime.of(16, 20)))));
        // Add an evening opening period to the short afternoon one
        openingTimes.get(DayOfWeek.MONDAY).add(new FromTo(LocalTime.of(17, 0), LocalTime.of(19, 0)));

        System.out.println(openingTimes);
    }
}
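
To fill a model like this from the spreadsheet text, the time tokens have to become LocalTime values at some point. Below is one possible sketch for tokens such as "8am", "7:15 AM" or "4:00 PM", assuming a 12-hour clock with an explicit am/pm marker. TimeParseSketch and parseTime are hypothetical names, and a bare "7" (as in "7 - 9:00 AM") is not handled because it would have to borrow its am/pm from the closing time.

```java
import java.time.LocalTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimeParseSketch {
    // Hour, optional ":minutes", then a required am/pm marker (any case, optional space).
    private static final Pattern TIME =
            Pattern.compile("(?i)\\s*(\\d{1,2})(?::(\\d{2}))?\\s*([ap])m\\s*");

    public static LocalTime parseTime(String text) {
        Matcher m = TIME.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised time: " + text);
        }
        int hour = Integer.parseInt(m.group(1));
        int minute = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        if (hour < 1 || hour > 12 || minute > 59) {
            throw new IllegalArgumentException("Out of range: " + text);
        }
        boolean pm = m.group(3).equalsIgnoreCase("p");
        // 12am is midnight (hour 0) and 12pm is noon (hour 12).
        int hour24 = (hour % 12) + (pm ? 12 : 0);
        return LocalTime.of(hour24, minute);
    }

    public static void main(String[] args) {
        System.out.println(parseTime("8am"));
        System.out.println(parseTime("7:15 AM"));
        System.out.println(parseTime("4:00 PM"));
    }
}
```

Two results from parseTime can then be passed to a record like FromTo above to describe one opening period.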

huangapple
  • Posted on June 12, 2023 at 11:30:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76453497.html