April 7, 2020, 13:17:59

Java: How to replace consecutive characters with a single character?

Question

How can I replace consecutive characters with a single character in Java?

    String fileContent = "def  mnop.UVW";
    String oldDelimiters = " .";
    String newDelimiter = "!";
    for (int i = 0; i < oldDelimiters.length(); i++) {
        Character character = oldDelimiters.charAt(i);
        fileContent = fileContent.replace(String.valueOf(character), newDelimiter);
    }

Current output: def!!mnop!UVW

Desired output: def!mnop!UVW

Notice that the two spaces are replaced with two exclamation marks. How can I replace consecutive delimiters with one delimiter?

Answer 1

Score: 2

Since you want to collapse runs of consecutive characters taken from the old delimiters, a regex solution doesn't seem feasible here. Instead, you can go through the string character by character: when a character is one of the old delimiters, skip over its repeats and append the new delimiter once, as shown below.

    import java.util.*;

    public class Main {
        public static void main(String[] args) {
            String fileContent = "def  mnop.UVW";
            String oldDelimiters = " .";

            // add all old delimiters to a set for fast membership checks
            Set<Character> set = new HashSet<>();
            for (int i = 0; i < oldDelimiters.length(); ++i) set.add(oldDelimiters.charAt(i));

            /*
               when the current char is an old delimiter, skip its consecutive
               repeats and append the new delimiter once
            */
            String newDelimiter = "!";
            StringBuilder res = new StringBuilder();
            for (int i = 0; i < fileContent.length(); ++i) {
                if (set.contains(fileContent.charAt(i))) {
                    // advance past repeats of the same delimiter character
                    while (i + 1 < fileContent.length() && fileContent.charAt(i) == fileContent.charAt(i + 1)) i++;
                    res.append(newDelimiter);
                } else {
                    res.append(fileContent.charAt(i));
                }
            }

            System.out.println(res.toString()); // def!mnop!UVW
        }
    }

Demo: https://onlinegdb.com/r1BC6qKP8
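The loop above merges a run only when neighbouring characters are identical, so a mixed run such as a space followed by a dot still produces two new delimiters. If mixed runs should also collapse into one (an assumption, since the question does not say), a small flag is enough. The class name `DelimiterRuns` and the method `collapse` below are hypothetical names chosen for this sketch:

```java
import java.util.HashSet;
import java.util.Set;

class DelimiterRuns {
    // Hypothetical helper: collapses every run of delimiter characters
    // (identical or mixed) into a single copy of newDelimiter.
    static String collapse(String text, String oldDelimiters, String newDelimiter) {
        Set<Character> delimiters = new HashSet<>();
        for (char c : oldDelimiters.toCharArray()) {
            delimiters.add(c);
        }

        StringBuilder result = new StringBuilder();
        boolean inRun = false; // true while the previous character was a delimiter
        for (char c : text.toCharArray()) {
            if (delimiters.contains(c)) {
                if (!inRun) {
                    result.append(newDelimiter); // first delimiter of a run
                }
                inRun = true;
            } else {
                result.append(c);
                inRun = false;
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("def  mnop.UVW", " .", "!")); // def!mnop!UVW
        System.out.println(collapse("abc .df", " .", "!"));       // abc!df
    }
}
```

Whether `abc .df` should become `abc!df` or `abc!!df` depends on the requirements; this sketch picks the former.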

Answer 2

Score: 1

The biggest difficulty in using a regex for this is creating an expression from your oldDelimiters string. For example:

    String oldDelimiters = " .";
    String expression = "\\" + String.join("+|\\", oldDelimiters.split("")) + "+";
    String text = "def  mnop.UVW;abc .df";
    String result = text.replaceAll(expression, "!");

(Edit: since characters in the expression are now escaped anyway, I removed the character classes and edited the following text to reflect that change.)

The generated expression looks like \ +|\.+, i.e. each character is quantified and constitutes one alternative of the expression. At each position the engine matches one alternative, if any can be matched, and replaces that whole match. result now contains:

def!mnop!UVW;abc!!df

Not sure how backwards compatible this is, due to split() behaviour in earlier versions of Java (before Java 8, splitting on the empty string produced a leading empty string), but with current versions this should be fine.

Edit: As it is, this breaks if the delimiting characters include digits or letters that become regex tokens once prefixed with a backslash (e.g. 1 becomes the backreference \1 and b becomes the word boundary \b).
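One way to sidestep the escaping problem mentioned in the edit is to quote each delimiter with `Pattern.quote` instead of prefixing a backslash. The following is only a sketch under that assumption; the class `QuotedDelimiterRegex` and its methods are hypothetical names and not part of the original answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class QuotedDelimiterRegex {
    // Hypothetical helper: one quantified alternative per delimiter character,
    // each quoted so that digits and letters are treated literally.
    static String buildExpression(String oldDelimiters) {
        return oldDelimiters.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .map(ch -> "(?:" + Pattern.quote(ch) + ")+")
                .collect(Collectors.joining("|"));
    }

    static String replaceRuns(String text, String oldDelimiters, String newDelimiter) {
        if (oldDelimiters.isEmpty()) {
            return text; // an empty regex would match between every character
        }
        // quoteReplacement keeps '$' and '\' in newDelimiter literal
        return text.replaceAll(buildExpression(oldDelimiters),
                Matcher.quoteReplacement(newDelimiter));
    }

    public static void main(String[] args) {
        System.out.println(replaceRuns("def  mnop.UVW;abc .df", " .", "!")); // def!mnop!UVW;abc!!df
        System.out.println(replaceRuns("a11bb2", "1b", "-"));                // a--2
    }
}
```

As in the answer, each alternative covers runs of a single character, so a space followed by a dot still yields two new delimiters.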

Answer 3

Score: 1

    s = s.replaceAll("([ \\.])[ \\.]+", "$1");

Or, if only runs of the same delimiter have to be replaced:

    s = s.replaceAll("([ \\.])\\1+", "$1");

- `[....]` is a set of alternative characters (a character class)
- The first `(...)` is group 1, referenced as `$1` in the replacement
- `\\1` matches the text captured by the first group
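To make the difference between the two patterns concrete, here is a small sketch, assuming the second pattern is meant to use the backreference `\\1+`. The class and method names are hypothetical. The third method is an addition that is not in the answer: it replaces each run of one repeated delimiter directly with `!`, which gives the output the question asks for.

```java
class BackreferenceDemo {
    // Collapses any run of two or more delimiters (mixed or not) to the first one.
    static String collapseMixed(String s) {
        return s.replaceAll("([ \\.])[ \\.]+", "$1");
    }

    // Collapses only runs of one repeated delimiter; \\1 must match what group 1 captured.
    static String collapseSame(String s) {
        return s.replaceAll("([ \\.])\\1+", "$1");
    }

    // Assumption for illustration: replace each same-character run with "!" in one pass.
    static String replaceSameRuns(String s) {
        return s.replaceAll("([ \\.])\\1*", "!");
    }

    public static void main(String[] args) {
        String s = "def  mnop.UVW;abc .df";
        System.out.println(collapseMixed(s));                  // def mnop.UVW;abc df
        System.out.println(collapseSame(s));                   // def mnop.UVW;abc .df
        System.out.println(replaceSameRuns("def  mnop.UVW"));  // def!mnop!UVW
    }
}
```

Note that the first two methods keep the original delimiter character, so a further replacement is still needed to turn it into the new delimiter.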

Answer 4

Score: 1

While not using regex, I thought a solution with Streams was needed, because everyone loves streams:

    // imports needed at the top of the enclosing class file
    import java.util.function.Predicate;
    import java.util.stream.Collectors;

    private static class StatefulFilter implements Predicate<String> {
        private final String needle;
        private String last = null;

        public StatefulFilter(String needle) {
            this.needle = needle;
        }

        @Override
        public boolean test(String value) {
            // a duplicate is a needle that directly follows another needle
            boolean duplicate = last != null && last.equals(value) && value.equals(needle);
            last = value;
            return !duplicate;
        }
    }

    public static void main(String[] args) {
        System.out.println(
            "def  mnop.UVW"
            .codePoints()
            .sequential()
            .mapToObj(c -> String.valueOf((char) c))
            .filter(new StatefulFilter(" "))
            .map(x -> x.equals(" ") ? "!" : x)
            .collect(Collectors.joining(""))
        );
    }

Runnable example: https://onlinegdb.com/BkY0R2twU

Explanation:

Theoretically, you aren't really supposed to have a stateful filter, but technically, as long as the stream is not parallelized, it works fine:

.codePoints() - turns the String into an IntStream of code points

.sequential() - since we care about the order of characters, our Stream may not be processed in parallel

.mapToObj(c -> String.valueOf((char) c)) - the comparison in the filter is more intuitive if we convert to String, but it's not really needed

.filter(new StatefulFilter(" ")) - here we filter out any space that comes after another space

.map(x -> x.equals(" ") ? "!" : x) - now we can replace the remaining spaces with exclamation marks

.collect(Collectors.joining("")) - and finally we can join the characters together to reconstitute a String

The StatefulFilter itself is pretty straightforward. It checks a) whether we have a previous character at all, b) whether the previous character is the same as the current character and c) whether the current character is the delimiter (space). It returns false (meaning the character gets deleted) only if a, b and c are all true.
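If the stateful predicate is a concern, one possible alternative is to stream over indices, so that each decision depends only on the input string and not on what the filter saw before. This is a sketch of that idea rather than the answer's method; `StatelessStreamCollapse` and `collapse` are hypothetical names:

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StatelessStreamCollapse {
    // Hypothetical helper: same result as the StatefulFilter pipeline for a single
    // delimiter, but the predicate only reads the immutable input string.
    static String collapse(String text, char delimiter, String newDelimiter) {
        return IntStream.range(0, text.length())
                // drop a delimiter when the character before it is the same delimiter
                .filter(i -> i == 0
                        || text.charAt(i) != delimiter
                        || text.charAt(i - 1) != delimiter)
                // replace each surviving delimiter, keep every other character
                .mapToObj(i -> text.charAt(i) == delimiter
                        ? newDelimiter
                        : String.valueOf(text.charAt(i)))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(collapse("def  mnop.UVW", ' ', "!")); // def!mnop.UVW
    }
}
```

Like the pipeline in the answer, this handles only the space delimiter, so the dot in the sample input is left untouched.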
