Split string and extract text and number
Question
I have to split an address into street and number. Examples:
Lievensberg 31D
Jablunkovska 21/2
Weimarstraat 113 A
Pastoor Baltesenstraat 22
Van Musschenboek strasse 84
I need to split like this:
Street1: Lievensberg
Number1: 31D
Street2: Jablunkovska
Number2: 21/2
Street3: Weimarstraat
Number3: 113 A
Street4: Pastoor Baltesenstraat
Number4: 22
Street5: Van Musschenboek strasse
Number5: 84
I used this code, but it does not work, because I need to split only when the character after the whitespace is a digit:
String[] arrSplit = address_line.split("\\s");
for (int i = 0; i < arrSplit.length; i++) {
    System.out.println(arrSplit[i]);
}
I don't know how to do this so that all my requirements are met. Any ideas?
Answer 1
Score: 2
If the number is optional, you could use two capturing groups instead of split, where the second group is optional.
^([^\d\r\n]+?)(?:\h*(\d.*)|$)
Explanation
- ^ Start of string (start of each line when Pattern.MULTILINE is set)
- ([^\d\r\n]+?) Capture group 1: match 1+ times any char except a digit or a newline, non-greedy
- (?: Non-capture group
- \h*(\d.*) Match 0+ horizontal whitespace chars, then capture in group 2 a digit followed by the rest of the line
- | Or
- $ End of string
- ) Close non-capture group
Example code
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String regex = "^([^\\d\\r\\n]+?)(?:\\h*(\\d.*)|$)";
String string = "Lievensberg 31D\n"
        + "Jablunkovska 21/2\n"
        + "Weimarstraat 113 A\n"
        + "Pastoor Baltesenstraat 22\n"
        + "Van Musschenboek strasse 84\n"
        + "Lievensberg";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println("Street: " + matcher.group(1));
    // Group 2 is null when the line contains no number
    if (matcher.group(2) != null) {
        System.out.println("Number: " + matcher.group(2));
    }
    System.out.println("------------------");
}
Output
Street: Lievensberg
Number: 31D
------------------
Street: Jablunkovska
Number: 21/2
------------------
Street: Weimarstraat
Number: 113 A
------------------
Street: Pastoor Baltesenstraat
Number: 22
------------------
Street: Van Musschenboek strasse
Number: 84
------------------
Street: Lievensberg
------------------
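To reuse the pattern for one address line at a time, it could be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the class name AddressSplitter and the method split are hypothetical, and returning a two-element array (with null for a missing number) is just one possible design.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressSplitter {
    // Same pattern as above; MULTILINE is not needed for a single line.
    private static final Pattern ADDRESS =
            Pattern.compile("^([^\\d\\r\\n]+?)(?:\\h*(\\d.*)|$)");

    /**
     * Returns {street, number}. The number is null when the line has no digit.
     * Returns null when the line does not match at all
     * (for example when it is empty or starts with a digit).
     */
    public static String[] split(String line) {
        Matcher m = ADDRESS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("Weimarstraat 113 A");
        System.out.println("Street: " + parts[0]);
        System.out.println("Number: " + parts[1]);
    }
}
```

Compiling the pattern once in a static field avoids recompiling it for every address.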
Answer 2
Score: 1
You can use a regex to first check whether the address matches, and process it only if it does.
String str1 = "Lievensberg 31D"; // street = Lievensberg, number = 31D
String str2 = "Lievensberg NN31D"; // doesn't match
String str3 = "Lievensberg"; // street = Lievensberg, number = null
String str4 = "Pastoor Baltesenstraat 22"; // street = Pastoor Baltesenstraat, number = 22
Pattern pattern = Pattern.compile("([a-zA-Z ]+?)(\\s(\\d+)(.*))?");
Matcher matcher = pattern.matcher(str1);
if (matcher.matches()) {
    String street = matcher.group(1);
    // Group 2 is the optional " <digits><rest>" part; groups 3 and 4 are its pieces
    String number = matcher.group(2) != null ? matcher.group(3) + matcher.group(4) : null;
    System.out.println("street = " + street);
    System.out.println("number = " + number);
}
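The snippet above only runs the matcher against str1. To check the results noted in the comments for all four inputs, the same pattern could be applied in a loop. This is a sketch under assumptions: the class name MatchAll and the helper describe are hypothetical and only format the result as text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchAll {
    private static final Pattern PATTERN =
            Pattern.compile("([a-zA-Z ]+?)(\\s(\\d+)(.*))?");

    // Returns "street = ..., number = ..." or "no match" for one input.
    public static String describe(String input) {
        Matcher matcher = PATTERN.matcher(input);
        if (!matcher.matches()) {
            return "no match";
        }
        String number = matcher.group(2) != null
                ? matcher.group(3) + matcher.group(4)
                : null;
        return "street = " + matcher.group(1) + ", number = " + number;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Lievensberg 31D",
            "Lievensberg NN31D",
            "Lievensberg",
            "Pastoor Baltesenstraat 22"
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + describe(input));
        }
    }
}
```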
Answer 3
Score: 1
Something like this:
ArrayList<String> list = new ArrayList<>();
list.add("Lievensberg 31D");
list.add("Jablunkovska 21/2");
list.add("Weimarstraat 113 A");
list.add("Pastoor Baltesenstraat 22");
list.add("Van Musschenboek strasse 84");
for (int i = 0; i < list.size(); i++) {
    // Split once on the whitespace that precedes a digit
    String[] parts = list.get(i).split("\\s+(?=\\d)");
    System.out.println("Street" + (i + 1) + ": " + parts[0]);
    System.out.println("Number" + (i + 1) + ": " + parts[1]);
}
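Indexing the split result with [1] assumes every address contains a number; for an input such as "Lievensberg" the array has only one element and the access throws an ArrayIndexOutOfBoundsException. A possible guard is sketched below. The class name SplitWithGuard and the method streetAndNumber are hypothetical, and using an empty string for a missing number is an assumption, not something the question specifies.

```java
public class SplitWithGuard {
    // Splits at the first whitespace run that is followed by a digit.
    // The limit of 2 stops after the first split point, so the rest of
    // the line stays together in the number part.
    public static String[] streetAndNumber(String address) {
        String[] parts = address.split("\\s+(?=\\d)", 2);
        String street = parts[0];
        // When no split point exists, parts has length 1: use an empty number.
        String number = parts.length > 1 ? parts[1] : "";
        return new String[] { street, number };
    }

    public static void main(String[] args) {
        String[] inputs = { "Weimarstraat 113 A", "Lievensberg" };
        for (String address : inputs) {
            String[] parts = streetAndNumber(address);
            System.out.println("Street: " + parts[0]);
            System.out.println("Number: " + parts[1]);
        }
    }
}
```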
Answer 4
Score: 0
You can use this logic:
- Find the index of the first number
- Split the string based on this index
The following code illustrates this:
public static void main(String[] args) {
    String address_line = "Weimarstraat 113 A";
    // Find the index of the first digit (the string length if there is none)
    int i = 0;
    while (i < address_line.length()) {
        char c = address_line.charAt(i);
        if ('0' <= c && c <= '9')
            break;
        i++;
    }
    // Split the string at that index and trim the whitespace around the parts
    System.out.println(address_line.substring(0, i).trim());
    System.out.println(address_line.substring(i).trim());
}
Its output will be:
Weimarstraat
113 A
Answer 5
Score: -1
Here's a simple solution using regex and split:
String str = "Jablunkovska 21/2";
String[] split = str.split("\\s(?=\\d)", 2);
System.out.println(Arrays.toString(split));
Output:
[Jablunkovska, 21/2]
The expression (?=\\d) is a lookahead for a digit: it only checks that a digit follows, so the digit is not removed by the split. The limit of 2 stops the split after the first match.
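To see what the lookahead changes, the same split can be compared with a delimiter that matches the digit directly. This is a small sketch for illustration; the class name LookaheadDemo and both method names are hypothetical.

```java
import java.util.Arrays;

public class LookaheadDemo {
    // Lookahead: only the whitespace is consumed, the digit stays in the result.
    public static String[] withLookahead(String s) {
        return s.split("\\s(?=\\d)", 2);
    }

    // No lookahead: the first digit is part of the delimiter and is lost.
    public static String[] withoutLookahead(String s) {
        return s.split("\\s\\d", 2);
    }

    public static void main(String[] args) {
        String str = "Jablunkovska 21/2";
        System.out.println(Arrays.toString(withLookahead(str)));    // [Jablunkovska, 21/2]
        System.out.println(Arrays.toString(withoutLookahead(str))); // [Jablunkovska, 1/2]
    }
}
```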