How to divide a Java String into two where the first substring is no longer than x and ends with a whole word

Question

I'm at a loss with dividing a string into two substrings. The first substring should be no longer than 35 characters and should end at the end of a word. So, if the 35-character limit falls mid-word, the string should be broken where that word starts (say, at position 32). By "word" I mean any combination of non-space characters; words are separated by spaces. The second substring can be of any length and, consequently, should start at the start of a word. The string is always longer than 35 characters and doesn't follow a pattern. How can I implement this? Thanks in advance!

Example:
>"Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."

This is a long string. I then need to get two strings: "Lordem ipsum dolor sit amet, $200" (fewer than 35 characters, ending where a word ends) and the rest as one big separate substring.

Answer 1

Score: 1

You can use StringTokenizer:

import java.util.StringTokenizer;

public class Test {

    public static void main(String[] args) {
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.";
        // returnDelims = true keeps each space as its own token,
        // so the accumulated length lines up with positions in str
        StringTokenizer strToken = new StringTokenizer(str, " ", true);
        StringBuilder first = new StringBuilder();

        while (strToken.hasMoreTokens()) {
            String next = strToken.nextToken();
            // stop before the token that would push the first part past 35 characters
            if (first.length() + next.length() > 35) {
                break;
            }
            first.append(next);
        }
        // everything after the consumed prefix; trim() drops the separating space
        String second = str.substring(first.length()).trim();
        System.out.println(first.toString().trim());
        System.out.println(second);
    }
}

Or, if you are on Java 9 or higher and want to try streams:

import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class Test {

    public static void main(String[] args) {
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.";

        // split before and after every space, so the spaces are kept as separate elements
        String[] splited = str.split("((?<= )|(?= ))");

        // the predicates keep a running total of the lengths seen so far;
        // stateful predicates like this are only reliable on a sequential, ordered stream
        AtomicInteger ai = new AtomicInteger(0);
        String f = Arrays.stream(splited).takeWhile(i -> ai.addAndGet(i.length()) <= 35).collect(Collectors.joining()).trim();

        AtomicInteger bi = new AtomicInteger(0);
        String s = Arrays.stream(splited).dropWhile(i -> bi.addAndGet(i.length()) <= 35).collect(Collectors.joining()).trim();

        System.out.println(f);
        System.out.println(s);
    }
}
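
The StringTokenizer loop can also be wrapped in a method that takes the limit as a parameter. The following is only a sketch: the class name `WordSplit` and the method name `splitAtWord` are made up for illustration, and it assumes words are separated by single spaces, as the question states. If the very first word is longer than the limit, the first part comes back empty.

```java
import java.util.StringTokenizer;

class WordSplit {

    // Hypothetical helper: returns { first, rest }, where first is at most
    // `limit` characters long and ends on a word boundary.
    static String[] splitAtWord(String text, int limit) {
        // returnDelims = true keeps each space as its own token
        StringTokenizer tokens = new StringTokenizer(text, " ", true);
        StringBuilder first = new StringBuilder();
        while (tokens.hasMoreTokens()) {
            String next = tokens.nextToken();
            // stop before the token that would exceed the limit
            if (first.length() + next.length() > limit) {
                break;
            }
            first.append(next);
        }
        // the remainder starts right after the consumed prefix
        String rest = text.substring(first.length()).trim();
        return new String[] { first.toString().trim(), rest };
    }

    public static void main(String[] args) {
        String[] parts = splitAtWord("one two three four", 9);
        System.out.println(parts[0]); // one two
        System.out.println(parts[1]); // three four
    }
}
```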

Answer 2

Score: -1

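You can use the following method with a length of 35 to get the desired result.

public static String[] splitAtLengthOrBeforeWord(String s, int length) {
    if (length <= 0) {
        throw new IllegalArgumentException("length must be greater than 0");
    }

    // the whole string already fits into the first part
    if (s.length() <= length) {
        return new String[] { s, "" };
    }

    // walk backward from the limit; index `length` itself is checked too,
    // because a space there means the first `length` characters end on a whole word
    for (int i = length; i >= 0; i--) {
        char c = s.charAt(i);
        if (Character.isWhitespace(c)) {
            // i + 1 skips the separator so the second part starts with a word
            return new String[] { s.substring(0, i), s.substring(i + 1) };
        }
    }
    // no whitespace within the limit: the first word alone is too long
    return new String[] { "", s };
}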
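
To see what such a method returns for the sample sentence, here is a small self-contained sketch. The class name `SplitDemo` is made up, and the copy of the method inside it assumes that a space exactly at index `length` counts as a valid break and that the separating space should be dropped from the second part.

```java
class SplitDemo {

    static String[] splitAtLengthOrBeforeWord(String s, int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be greater than 0");
        }
        if (s.length() <= length) {
            return new String[] { s, "" };
        }
        // search backward from the limit for the nearest whitespace
        for (int i = length; i >= 0; i--) {
            if (Character.isWhitespace(s.charAt(i))) {
                return new String[] { s.substring(0, i), s.substring(i + 1) };
            }
        }
        return new String[] { "", s };
    }

    public static void main(String[] args) {
        String sample = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit";
        String[] parts = splitAtLengthOrBeforeWord(sample, 35);
        System.out.println(parts[0]); // Lordem ipsum dolor sit amet, $200
        System.out.println(parts[1]); // cons(35 chars until here)ectetur adipiscing elit

        // an input that already fits is returned whole, with an empty second part
        String[] whole = splitAtLengthOrBeforeWord("short text", 35);
        System.out.println(whole[0]); // short text
        System.out.println(whole[1].isEmpty()); // true
    }
}
```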

Answer 3

Score: -1

You can use the lastIndexOf method of the String class. First check whether the character at index 35 is a space; if it is, simply split there. Otherwise, take the first 35 characters and find the index of the last space in them: that index marks the start of the word that crosses the limit, which is the position we are looking for. The code below follows this logic. You can add other safety checks as required.

public class Test {

    public static void main(String[] args) {
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.";
        String str1, str2;
        if (str.charAt(35) == ' ') {
            // the limit falls exactly on a separator: split there
            str1 = str.substring(0, 35);
            str2 = str.substring(36);
        } else {
            // otherwise split at the last space within the first 35 characters
            int ind = str.substring(0, 35).lastIndexOf(' ');
            if (ind < 0) {
                // no space at all: the first word alone exceeds the limit
                str1 = "";
                str2 = str;
            } else {
                str1 = str.substring(0, ind);
                str2 = str.substring(ind + 1);
            }
        }
        System.out.println(str1);
        System.out.println(str2);
    }
}
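
The two branches can probably be folded into one by using the two-argument overload `String.lastIndexOf(int ch, int fromIndex)`, which searches backward starting at `fromIndex`. A minimal sketch under that assumption follows; the class name `LastSpaceSplit` and the method name `split` are hypothetical, and only the plain space character is treated as a separator.

```java
class LastSpaceSplit {

    // Hypothetical helper: returns { first, rest } with first.length() <= limit.
    static String[] split(String str, int limit) {
        if (str.length() <= limit) {
            return new String[] { str, "" };
        }
        // searching backward from index `limit` also finds a space that sits
        // exactly at the limit, so no separate charAt check is needed
        int ind = str.lastIndexOf(' ', limit);
        if (ind < 0) {
            // no space within the limit: the first word alone is too long
            return new String[] { "", str };
        }
        return new String[] { str.substring(0, ind), str.substring(ind + 1) };
    }

    public static void main(String[] args) {
        String[] a = split("aaaa bbbb cccc", 9);
        System.out.println(a[0]); // aaaa bbbb
        System.out.println(a[1]); // cccc

        String[] b = split("aaaa bbbb cccc", 7);
        System.out.println(b[0]); // aaaa
        System.out.println(b[1]); // bbbb cccc
    }
}
```

With a limit of 9 the space at index 9 is found directly, so the first part uses the full 9 characters; with a limit of 7 the search falls back to the space at index 4.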

huangapple
  • Posted on October 16, 2020, 17:57:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64386948.html