How should I interpret the output of tdda.rexpy.extract?
Question
I am interested in Rexpy because I am looking for a tool that infers a regular expression matching a given string. Inspecting rexpy.extract with help, it looked like it might be what I want.
extract(examples, tag=False, encoding=None, as_object=False, extra_letters=None, full_escape=False, remove_empties=False, strip=False, variableLengthFrags=False, max_patterns=None, min_diff_strings_per_pattern=1, min_strings_per_pattern=1, size=None, seed=None, dialect='portable', verbose=0)
Extract regular expression(s) from examples and return them.
Normally, examples should be unicode (i.e. ``str`` in Python3,
and ``unicode`` in Python2). However, encoded strings can be
passed in provided the encoding is specified.
Results will always be unicode.
If as_object is set, the extractor object is returned,
with results in .results.rex; otherwise, a list of regular
expressions, as unicode strings is returned.
So I tried an example:
>>> from tdda import rexpy
>>> s = 'andrew.gelman@statistics.com'
>>> rexpy.extract(s)
['^[.@]$', '^[a-z]$']
I expected something similar to ['^[a-z].[a-z]@[a-z].[a-z]$'] rather than ['^[.@]$', '^[a-z]$']. Is the extractor just telling me that the special symbols '.' and '@' are used somewhere in the string?
Answer 1
Score: 3
The examples parameter expects an iterable of strings. When you pass a single string, the function iterates over its individual characters and outputs regular expressions that match those single-character examples.
Try providing a list of strings instead, e.g. rexpy.extract(
.
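To illustrate the point, here is a minimal sketch of why a bare string behaves this way and how wrapping it in a list changes what the function sees. The one-element list is only an illustrative input, not the example from the original answer; in practice you would probably pass several sample strings. The rexpy call is guarded in case tdda is not installed, and its output is deliberately not shown here.

```python
s = 'andrew.gelman@statistics.com'

# A str is itself an iterable of one-character strings, so anything that
# loops over `examples` sees many single-character examples, not one address.
as_iterated = list(s)
print(as_iterated[:6])

# Wrapping the string in a list makes it a single example.
# (Illustrative input only; more sample strings give rexpy more to work with.)
examples = [s]
print(len(examples))

try:
    from tdda import rexpy
    # Same function as in the question, now given a list of strings.
    print(rexpy.extract(examples))
except ImportError:
    print('tdda is not installed; skipping the rexpy call')
```

The first print shows the individual letters, which is consistent with the question's output containing only a letter class and a punctuation class.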