Alternative to 'find' which supports PCRE
Question
Linux's find command does not support Perl compatible regular expressions (PCRE).
Is there an alternative that supports PCRE and is concise to use (one line on the command line)?
I found some one-liners, but they were long and complicated, which makes it difficult to understand what they do and a pain to write every time.
Examples:
uses pipelining, -options, and multiple functions.
https://stackoverflow.com/questions/19894673/unix-linux-freebsd-find-command-with-perl-regex
uses a lot of options and also Perl
I tried using Perl directly but didn't find a pure Perl one-liner for it.
Example:
Gives a one-liner for finding matches within a single file, but does not find filename matches within a directory.
Answer 1
Score: 2
Using Perl's File::Find
perl -MFile::Find -wE'find( sub { say $File::Find::name if /\.pl$/ }, q(.) )'
This finds all entries whose names end with .pl, recursively, anywhere under the current directory.
Or take the directory as input, with the current directory as the default:
perl -MFile::Find -wE' $d = shift//q(.);
find( sub { say $File::Find::name if /\.pl$/ }, $d )' directory-name
Or collect all the files found for possible post-processing, writing to a file, etc.:
perl -MFile::Find -wE' $d = shift//q(.);
find( sub { push @f, $File::Find::name if /\.pl$/ }, $d );
say for @f' directory-name
(If run without a directory-name then the current directory is used)
However, I don't see why a simple pipeline of find + grep isn't suitable. grep itself supports basic regex, with -E it supports extended regex, and with -P (Perl) it uses PCRE. So
find ... | grep -P regex
does exactly what is asked, in one command line. The criteria for filenames can then be split: some go with find's own globbing and some go in grep's regex.
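As a concrete sketch of that split, the following lets find handle the file type and the glob, and leaves the PCRE-only parts (a lookbehind and \d) to grep -P. The directory layout is invented for the demo, and it assumes a GNU grep built with PCRE support.

```shell
# Build a throwaway directory tree to search (hypothetical file names).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/logs/old"
touch "$demo_dir/logs/app2023.txt" "$demo_dir/logs/old/app7.txt" \
      "$demo_dir/logs/readme.txt" "$demo_dir/logs/app2024.log"

# find narrows the candidates by type and glob (-name '*.txt');
# grep -P then applies the PCRE part to each printed path:
#   (?<=/)  lookbehind: the name must start right after a slash
#   \d{4}   exactly four digits before the extension
grep_matches=$(
  cd "$demo_dir" &&
  find . -type f -name '*.txt' |
    grep -P '(?<=/)[a-z]+\d{4}\.txt$' |
    LC_ALL=C sort
)
printf '%s\n' "$grep_matches"

# Clean up the demo tree.
rm -rf "$demo_dir"
```

Since grep works line by line on the paths that find prints, this simple form assumes filenames without embedded newlines.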
Finally, the question asks for PCRE, and find indeed doesn't have PCRE regex, as stated. However, find does support other flavors of regex. The man page has only a basic statement, while a detailed description of the differences between the flavors can be found with the info find command on Linux (which I couldn't find on the internet).
In short, the main differences from PCRE as used by grep and other tools are: 1) the regex pattern has to match the whole path, not just a substring of it, and 2) the regex syntax is very basic.
So to find a file which has letters and then numbers before a .txt extension in the filename, with anything else for the path, anywhere in or under the current directory:
find . -type f -regex '.*\/[a-zA-Z]+[0-9]+\.txt'
Note that the leading .* is necessary; otherwise the path leading up to the filename can't be matched (there's at least ./ in it).
Basic as it is in comparison with full PCRE, this may well be enough for most uses.
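To see the whole-path rule in action, here is a small sketch that runs the pattern with and without the leading .* on a made-up directory tree. The last command uses -regextype, a GNU find option for selecting another flavor (posix-extended here), so that part assumes GNU findutils.

```shell
# Build a throwaway directory tree to search (hypothetical file names).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
touch "$demo_dir/abc123.txt" "$demo_dir/sub/x9.txt" "$demo_dir/123.txt"

# Without the leading .* the pattern would have to match the path from
# its first character (the "." in "./abc123.txt"), so nothing is found.
no_prefix=$(cd "$demo_dir" && find . -type f -regex '[a-zA-Z]+[0-9]+\.txt')

# With the leading .*/ the directory part of the path is consumed first.
with_prefix=$(
  cd "$demo_dir" &&
  find . -type f -regex '.*/[a-zA-Z]+[0-9]+\.txt' | LC_ALL=C sort
)

# GNU find only: choose the POSIX extended flavor, which allows
# interval expressions such as {1,3}.
extended=$(
  cd "$demo_dir" &&
  find . -regextype posix-extended -type f \
       -regex '.*/[a-z]+[0-9]{1,3}\.txt' | LC_ALL=C sort
)

printf 'no prefix:   [%s]\n' "$no_prefix"
printf 'with prefix:\n%s\n' "$with_prefix"
printf 'extended:\n%s\n' "$extended"

# Clean up the demo tree.
rm -rf "$demo_dir"
```

Both of the last two commands find abc123.txt and sub/x9.txt but not 123.txt, since that name has no letters before the digits.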