Java: How to replace consecutive characters with a single character?

Question

How can I replace consecutive characters with a single character in Java?

String fileContent = "def  mnop.UVW";
String oldDelimiters = " .";
String newDelimiter = "!";
for (int i = 0; i < oldDelimiters.length(); i++){
    Character character = oldDelimiters.charAt(i);
    fileContent = fileContent.replace(String.valueOf(character), newDelimiter);
}

Current output: def!!mnop!UVW

Desired output: def!mnop!UVW

Notice that the two spaces are replaced with two exclamation marks. How can I replace consecutive delimiters with one delimiter?

Answer 1

Score: 2

Since you want to collapse consecutive occurrences of an old delimiter character, you can skip the regex and scan the string character by character instead. Whenever the current character is one of the old delimiters, skip past any immediate repeats of it and append the new delimiter once, as shown below.

import java.util.*;

public class Main {
    public static void main(String[] args) {
        String fileContent = "def  mnop.UVW";
        String oldDelimiters = " .";
        String newDelimiter = "!";

        // Add all old delimiters to a set for fast membership checks.
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < oldDelimiters.length(); ++i) set.add(oldDelimiters.charAt(i));

        /*
           Walk the input once. When the current char is an old delimiter,
           skip any immediate repeats of that same char and append the new
           delimiter once; copy every other char unchanged.
        */
        StringBuilder res = new StringBuilder();
        for (int i = 0; i < fileContent.length(); ++i) {
            if (set.contains(fileContent.charAt(i))) {
                while (i + 1 < fileContent.length() && fileContent.charAt(i) == fileContent.charAt(i + 1)) i++;
                res.append(newDelimiter);
            } else {
                res.append(fileContent.charAt(i));
            }
        }

        System.out.println(res.toString()); // prints def!mnop!UVW
    }
}

Demo: https://onlinegdb.com/r1BC6qKP8
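
The loop above collapses only runs of the same character, so a mixed run such as a space followed by a period still produces two new delimiters. If any run of delimiter characters should become a single delimiter, something along these lines might work. This is only a sketch: the class name DelimiterRunCollapser and the method collapse are illustrative and not part of the original answer.

```java
import java.util.HashSet;
import java.util.Set;

class DelimiterRunCollapser {
    // Hypothetical helper: replaces every maximal run of delimiter characters
    // (same or mixed) with a single copy of newDelimiter.
    static String collapse(String text, String oldDelimiters, String newDelimiter) {
        Set<Character> delimiters = new HashSet<>();
        for (char c : oldDelimiters.toCharArray()) {
            delimiters.add(c);
        }

        StringBuilder out = new StringBuilder();
        boolean inRun = false; // true while we are inside a run of delimiters
        for (char c : text.toCharArray()) {
            if (delimiters.contains(c)) {
                if (!inRun) {
                    out.append(newDelimiter); // emit once per run
                }
                inRun = true;
            } else {
                out.append(c);
                inRun = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("def  mnop.UVW", " .", "!")); // def!mnop!UVW
        System.out.println(collapse("abc .df", " .", "!"));       // abc!df
    }
}
```

Whether a mixed run should count as one delimiter or several depends on the file format being parsed, so pick the variant that matches your data.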

Answer 2

Score: 1

The biggest difficulty in using a regex for this is building the expression from your oldDelimiters string. For example:

String oldDelimiters = " .";
String expression = "\\" + String.join("+|\\", oldDelimiters.split("")) + "+";
String text = "def  mnop.UVW;abc .df";
String result = text.replaceAll(expression, "!");

(Edit: since characters in the expression are now escaped anyway, I removed the character classes and edited the following text to reflect that change.)

The generated expression looks like \ +|\.+, i.e. each character is escaped, quantified, and forms one alternative of the expression. At each position the engine matches whichever alternative applies and replaces that whole run with a single "!". result now contains:

def!mnop!UVW;abc!!df

I'm not sure how backwards compatible this is: before Java 8, splitting on the empty string produced a leading empty string in the result array, which would change the generated expression. With current versions this should be fine.

Edit: As it stands, this breaks if the delimiter characters include digits or letters that become regex tokens when preceded by a backslash (e.g. 1, b, etc.).
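
One possible way around the escaping problem is to let Pattern.quote escape each delimiter instead of prefixing a backslash by hand. The sketch below assumes that approach; the class QuotedDelimiterRegex and its methods buildExpression and replaceRuns are illustrative names, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class QuotedDelimiterRegex {
    // Hypothetical helper: builds one alternative per delimiter, each of the
    // form (?:<quoted delimiter>)+, joined with '|'.
    static String buildExpression(String oldDelimiters) {
        return oldDelimiters.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .map(ch -> "(?:" + Pattern.quote(ch) + ")+")
                .collect(Collectors.joining("|"));
    }

    // Replaces each run of one delimiter with a single newDelimiter.
    static String replaceRuns(String text, String oldDelimiters, String newDelimiter) {
        if (oldDelimiters.isEmpty()) {
            return text; // an empty pattern would match between every character
        }
        // quoteReplacement keeps '$' and '\' in newDelimiter literal.
        return text.replaceAll(buildExpression(oldDelimiters),
                Matcher.quoteReplacement(newDelimiter));
    }

    public static void main(String[] args) {
        System.out.println(replaceRuns("def  mnop.UVW;abc .df", " .", "!")); // def!mnop!UVW;abc!!df
        System.out.println(replaceRuns("a11b2bb", "1b", "-"));               // a--2-
    }
}
```

Because every delimiter is quoted, digits, letters, and regex metacharacters are all treated as literal text.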

Answer 3

Score: 1

s = s.replaceAll("([ \\.])[ \\.]+", "$1");

Or, if only runs of the same delimiter have to be collapsed:

s = s.replaceAll("([ \\.])\\1+", "$1");
  • [....] is a character class, i.e. a set of alternative characters
  • The first (...) is group 1, referenced as $1 in the replacement
  • \\1 is a backreference to the text matched by the first group
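
To compare the behaviours, here is a small sketch. Note that with "$1" as the replacement the run is collapsed but the original delimiter is kept, so getting "!" in the output needs a literal replacement instead. The class BackreferenceDemo and its method names are illustrative only, and the character class is written as [ .] because a period needs no escaping inside brackets.

```java
class BackreferenceDemo {
    // Collapses a run of the same delimiter into one copy of that delimiter.
    static String collapseSame(String s) {
        return s.replaceAll("([ .])\\1+", "$1");
    }

    // Collapses a run of the same delimiter and replaces it with "!".
    // \\1* also matches a single delimiter, so lone delimiters are replaced too.
    static String collapseSameAndReplace(String s) {
        return s.replaceAll("([ .])\\1*", "!");
    }

    // Treats any mix of delimiters as one run and replaces it with "!".
    static String collapseAnyAndReplace(String s) {
        return s.replaceAll("[ .]+", "!");
    }

    public static void main(String[] args) {
        System.out.println(collapseSame("def  mnop.UVW"));           // def mnop.UVW
        System.out.println(collapseSameAndReplace("def  mnop.UVW")); // def!mnop!UVW
        System.out.println(collapseSameAndReplace("abc .df"));       // abc!!df
        System.out.println(collapseAnyAndReplace("abc .df"));        // abc!df
    }
}
```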

Answer 4

Score: 1

This doesn't use a regex, but I thought a solution with Streams was needed, because everyone loves streams:

import java.util.function.Predicate;
import java.util.stream.Collectors;

public class Main {
    // Remembers the previous element so it can drop a repeated needle.
    private static class StatefulFilter implements Predicate<String> {
        private final String needle;
        private String last = null;

        public StatefulFilter(String needle) {
            this.needle = needle;
        }

        @Override
        public boolean test(String value) {
            boolean duplicate = last != null && last.equals(value) && value.equals(needle);
            last = value;
            return !duplicate;
        }
    }

    public static void main(String[] args) {
        System.out.println(
            "def  mnop.UVW"
            .codePoints()
            .sequential()
            // The (char) cast assumes characters in the Basic Multilingual Plane.
            .mapToObj(c -> String.valueOf((char) c))
            .filter(new StatefulFilter(" "))
            .map(x -> x.equals(" ") ? "!" : x)
            .collect(Collectors.joining(""))
        ); // prints def!mnop.UVW (only the space delimiter is handled here)
    }
}

Runnable example: https://onlinegdb.com/BkY0R2twU

Explanation:

Strictly speaking, the Stream API expects filter predicates to be stateless, but as long as the stream is not parallelized, this works fine in practice:

.codePoints() - turns the String into an IntStream of its code points

.sequential() - since we care about the order of characters, our Stream must not be processed in parallel

.mapToObj(c -> String.valueOf((char) c)) - the comparison in the filter is more intuitive if we convert to String, but it's not really needed

.filter(new StatefulFilter(" ")) - here we filter out any space that comes directly after another space

.map(x -> x.equals(" ") ? "!" : x) - now we can replace the remaining spaces with exclamation marks

.collect(Collectors.joining("")) - and finally we join the characters together to reconstitute a String

The StatefulFilter itself is pretty straightforward. It checks a) whether we have a previous character at all, b) whether the previous character is the same as the current character, and c) whether the current character is the delimiter (space). It returns false (meaning the character gets dropped) only if a, b and c are all true.
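
The filter above only knows about one delimiter (the space), so the period in the sample input is left alone. A possible generalisation, sketched here under the assumption that repeated copies of any delimiter in a given set should collapse, could look like this. The names MultiDelimiterStream, RepeatedDelimiterFilter, and collapse are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class MultiDelimiterStream {
    // Hypothetical variant of StatefulFilter that accepts a set of delimiters.
    // Like the original, it is stateful and only safe on a sequential stream.
    static final class RepeatedDelimiterFilter implements Predicate<String> {
        private final Set<String> delimiters;
        private String last = null;

        RepeatedDelimiterFilter(Set<String> delimiters) {
            this.delimiters = delimiters;
        }

        @Override
        public boolean test(String value) {
            // Drop the element if it repeats the previous one and is a delimiter.
            boolean duplicate = value.equals(last) && delimiters.contains(value);
            last = value;
            return !duplicate;
        }
    }

    static String collapse(String text, String oldDelimiters, String newDelimiter) {
        Set<String> delimiters = new HashSet<>();
        for (char c : oldDelimiters.toCharArray()) {
            delimiters.add(String.valueOf(c));
        }
        return text.chars() // sequential IntStream of UTF-16 code units
                .mapToObj(c -> String.valueOf((char) c))
                .filter(new RepeatedDelimiterFilter(delimiters)) // fresh state per call
                .map(s -> delimiters.contains(s) ? newDelimiter : s)
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(collapse("def  mnop.UVW", " .", "!")); // def!mnop!UVW
    }
}
```

As with the original filter, a new RepeatedDelimiterFilter is created for every call so that state from one string never leaks into the next.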

huangapple
  • Published on April 7, 2020 at 13:17:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/61073258.html