How should I interpret the output of tdda.rexpy.extract?
Question
I am interested in Rexpy because I am looking for a tool that infers a regular expression matching a given string. Inspecting rexpy.extract with help, it looked like it might be what I want.
extract(examples, tag=False, encoding=None, as_object=False, extra_letters=None, full_escape=False, remove_empties=False, strip=False, variableLengthFrags=False, max_patterns=None, min_diff_strings_per_pattern=1, min_strings_per_pattern=1, size=None, seed=None, dialect='portable', verbose=0)
    Extract regular expression(s) from examples and return them.
    
    Normally, examples should be unicode (i.e. ``str`` in Python3,
    and ``unicode`` in Python2). However, encoded strings can be
    passed in provided the encoding is specified.
    
    Results will always be unicode.
    
    If as_object is set, the extractor object is returned,
    with results in .results.rex; otherwise, a list of regular
    expressions, as unicode strings is returned.
So I tried an example:
>>> from tdda import rexpy
>>> s = 'andrew.gelman@statistics.com'
>>> rexpy.extract(s)
['^[.@]$', '^[a-z]$']
I expected something similar to ['^[a-z].[a-z]@[a-z].[a-z]$'] rather than ['^[.@]$', '^[a-z]$']. Is the extractor just telling me that special symbols '.' and '@' are used 'somewhere' in the string?
Answer 1
Score: 3
The examples parameter expects an iterable of strings. When you pass a single string, the function iterates over its individual characters and outputs regular expressions that match those single-character examples.
Try providing a list of strings instead, e.g. rexpy.extract(.
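As a minimal sketch of that suggestion: a Python str is itself an iterable of one-character strings, which is why the bare string produced single-character patterns. The helper as_examples below is hypothetical (it is not part of tdda); it only wraps a lone string in a list before calling rexpy.extract. The import is guarded so the snippet still runs where tdda is not installed.

```python
def as_examples(examples):
    """Hypothetical helper (not part of tdda): wrap a lone string in a list."""
    if isinstance(examples, str):
        return [examples]
    return list(examples)


s = 'andrew.gelman@statistics.com'

# Iterating a bare string yields its characters one at a time,
# so extract(s) sees many one-character examples.
chars = list(s)

# Wrapping the string keeps the whole address as a single example.
examples = as_examples(s)

try:
    from tdda import rexpy
    # Pass a list of strings, not a bare string.
    patterns = rexpy.extract(examples)
    print(patterns)
except ImportError:
    # tdda is not installed; the wrapping logic above still applies.
    patterns = None
```

With only one example the inferred pattern tends to be very specific to that string; passing several addresses in the list gives the extractor more to generalize from.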