Regex that matches words and words with apostrophes

Question

Sorry if this is a duplicate post. I have a method that is supposed to parse all the words in a file. A word should consist only of the letters a-z and apostrophes. Here's my code snippet:

public void loadInput(File fileName) throws IOException {
    try {
        Scanner sc = new Scanner(fileName);
        int numWords = 0;
        while (sc.hasNext("[A-Za-z\']+")) {
            String word = sc.next().toLowerCase(); // case-insensitive
            numWords++;
            System.out.println(word);
        }
        System.out.println("Total words in text file: " + numWords);
        sc.close();
    } catch (Exception e) {
        System.out.println("An error has occurred");
    }
}

As an example input:

alice's conversations in it, `and what is the use of a book,'
thought alice `without pictures or conversation?'

It should match all the words, including alice's, but not 'without (it should match only the word without).

Answer 1

Score: 0

You want a negative lookbehind.

(?<!')[\w']+, find [\w']+ not preceded by a '.

See https://regex101.com/r/ZfyerX/3
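
For comparison, here is a minimal Java sketch of a different pattern, not the lookbehind above. As far as I can tell, a lookbehind on its own still lets a match start at a leading apostrophe, because the character before that apostrophe is a space or the start of the input, and \w also accepts digits and underscores. The sketch assumes a word is a run of letters that may contain apostrophes only between letters, and it scans the text with Matcher.find() instead of Scanner.hasNext(pattern), which stops at the first token that does not match. The class name WordExtractor and the method extractWords are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordExtractor {
    // Assumed definition of a word: letters, optionally followed by
    // one or more groups of an apostrophe plus letters (e.g. alice's).
    private static final Pattern WORD =
            Pattern.compile("[A-Za-z]+(?:'[A-Za-z]+)*");

    // Returns every word found in the text, lower-cased, in order of appearance.
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group().toLowerCase());
        }
        return words;
    }

    public static void main(String[] args) {
        // The example input from the question.
        String input = "alice's conversations in it, `and what is the use of a book,'\n"
                + "thought alice `without pictures or conversation?'";
        List<String> words = extractWords(input);
        words.forEach(System.out::println);
        System.out.println("Total words in text file: " + words.size());
    }
}
```

To use this with a file as in the question, the contents could be read into a string first (for example with Files.readString(fileName.toPath()) on Java 11 or later) and passed to extractWords. Quotes and punctuation around a word are skipped because they are never part of a match.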

huangapple
  • Posted on August 26, 2020, 00:37:20
  • Source: https://go.coder-hub.com/63583388.html