2020年9月27日 21:23:16go评论132阅读模式

英文:

Regex ignore tokens that do not start with letter

问题

如何编写一个正则表达式，可以忽略不以字母开头的任何标记？它应该在Java中使用。

示例：it 's super cool --> 正则表达式应匹配：[it, super, cool]，并忽略['s]。

英文:

how can I write a regex that ignores any token that does not start with a letter? it should be used in java.

example: it 's super cool --> regex should match: [it, super, cool] and ignore ['s].

答案1

得分: 0

你可以使用(?<!\\p{Punct})(\\p{L}+)，这表示不在标点符号前面的字母。注意，(?<!用于指定负向后查看。详细了解请查阅Pattern的文档。

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
    public static void main(String[] args) {
        String str = "it's super cool";
        Pattern pattern = Pattern.compile("(?<!\\p{Punct})(\\p{L}+)");
        Matcher matcher = pattern.matcher(str);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}

输出结果:

it
super
cool

英文:

You can use (?<!\\p{Punct})(\\p{L}+) which means letters not preceded by a punctuation mark. Note that (?<! is used to specify a negative look behind. Check the documentation of Pattern to learn more about it.

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
	public static void main(String[] args) {
		String str = &quot;it &#39;s super cool&quot;;
		Pattern pattern = Pattern.compile(&quot;(?&lt;!\\p{Punct})(\\p{L}+)&quot;);
		Matcher matcher = pattern.matcher(str);
		while (matcher.find()) {
			System.out.println(matcher.group());
		}
	}
}

Output:

it
super
cool

答案2

得分: 0

替代正则表达式:

"(?:^|\\s)([A-Za-z]+)"

上下文中的正则表达式:

public static void main(String[] args) {
    String input = "it's super cool";
    Matcher matcher = Pattern.compile("(?:^|\\s)([A-Za-z]+)").matcher(input);
    while (matcher.find()) {
        String result = matcher.group(1);
        System.out.println(result);
    }
}

输出:

it
super
cool

注意: 若要匹配任何语言（如印地语、德语、中文、英语等）中的字母字符，请使用以下正则表达式：

"(?:^|\\s)(\\p{L}+)"

有关Pattern类以及Unicode脚本、块、类别和二进制属性的类的更多信息，请参见此处。

英文:

Alternative regex:

&quot;(?:^|\\s)([A-Za-z]+)&quot;

Regex in context:

public static void main(String[] args) {
    String input = &quot;it &#39;s super cool&quot;;
    Matcher matcher = Pattern.compile(&quot;(?:^|\\s)([A-Za-z]+)&quot;).matcher(input);
    while (matcher.find()) {
        String result = matcher.group(1);
        System.out.println(result);
    }
}

Output:

it
super
cool

Note: To match alphabetic characters, letters, in any language (e.g. Hindi, German, Chinese, English etc.), use the following regex instead:

&quot;(?:^|\\s)(\\p{L}+)&quot;

More about the class, Pattern and the classes for Unicode scripts, blocks, categories and binary properties, can be found here.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

正则表达式忽略不以字母开头的标记

问题

答案1

答案2

JSP页面如何找到正确的参数设置器

ANTLR / java / SDK 生成-编译-执行序列在 Windows10 命令窗口上失败。

Compiling Java sources with Clojure deps.edn.

有没有办法分别为saveAll或updateAll应用Spring缓存功能，使用Cacheable函数？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。