Skip folders and files during `grep`
Question
I have a folder (`/test`) that contains several subfolders and files at various levels. Inside `/test`, I want to iterate over only the text files containing a particular string (`$string`), ignoring the files and entire directories specified by path in a file (`$to_skip`). This "exclusion list" contains both file and folder paths, as shown below:
/test/ex1/fileA
/test/bob/ex1/fileB
/test/ex1/subfolder
/test/jerry/ex2
That is, one path per line. This is what I tried, but it didn't work (the various options are required for other reasons):
grep -nriFlI "$string" "/test" | grep -vFf "$to_skip" > "$out"
The 1st `grep` actually gives me the correct list of paths (any text files containing `$string`), but the 2nd `grep` doesn't filter as expected: `$out` contains all the items produced by the 1st `grep` except the occurrences corresponding to the last pattern.
The strange thing is that it works correctly only for the last pattern. For example, if `/test`, among the other data, has the following files/folders:
/test/jerry/ex2/f1
/test/jerry/ex2/f2
/test/jerry/ex2/foo/f3
/test/jerry/ex2/bar
and `/test/jerry/ex2/` is the last pattern specified in `$to_skip`, the paths above (and in general any data under `/test/jerry/ex2/`) are excluded correctly!
How can I achieve my target?
Many thanks!
Answer 1
Score: 0
As Dominique suggested, using `-prune` with `find` is a common way to exclude specified directories and files from the search results. Please try the following:
    #!/bin/bash
    # Build the "find" expression, e.g. "-path /test/foo -o -path /test/bar -o ..."
    opts=()
    while IFS= read -r f; do
        (( ${#opts[@]} > 0 )) && opts+=("-o")   # from the 2nd path onward
        opts+=("-path" "$f")
    done < "$to_skip"
    find /test \( "${opts[@]}" \) -prune -o -type f -exec grep -niFlI "$string" '{}' +
- The array `opts` holds the exclusion list built from "$to_skip".
- The pair of `\( .. \)` is needed for precedence.
- The `-r` option to `grep` is dropped because `find` already traverses the tree recursively and passes the files it finds as arguments.
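To try the `-prune` approach without touching real data, here is a minimal self-contained sketch that builds a throwaway tree and runs the same kind of `find` expression on it. The function name `search_with_skips` and the sample layout are hypothetical. The sketch also makes a few assumptions that are not confirmed in the thread: that the list may contain blank lines or trailing slashes, and that it may have been saved with Windows (CRLF) line endings. A stray carriage return on every line but the last is one possible reason why only the last pattern matched in the original `grep -vFf` pipeline.

```bash
#!/bin/bash
# Sketch: list files under a root that contain a fixed string, skipping the
# files and directories named (one path per line) in an exclusion list.
# The function name and the sample layout below are hypothetical.

search_with_skips() {
    local root=$1 string=$2 to_skip=$3
    local -a opts=()
    local f

    # "|| [[ -n $f ]]" keeps a final line that has no trailing newline.
    while IFS= read -r f || [[ -n $f ]]; do
        f=${f%$'\r'}             # assumption: the list may have CRLF line endings
        f=${f%/}                 # "-path" would not match a path with a trailing slash
        [[ -z $f ]] && continue  # ignore blank lines
        (( ${#opts[@]} > 0 )) && opts+=("-o")
        opts+=("-path" "$f")
    done < "$to_skip"

    if (( ${#opts[@]} > 0 )); then
        find "$root" \( "${opts[@]}" \) -prune -o -type f -exec grep -iFlI -- "$string" '{}' +
    else
        # An empty "\( \)" is a syntax error for find, so drop the prune clause.
        find "$root" -type f -exec grep -iFlI -- "$string" '{}' +
    fi
}

# Demo on a throwaway tree (all paths are made up for illustration).
demo=$(mktemp -d)
mkdir -p "$demo/ex1/subfolder" "$demo/jerry/ex2/foo" "$demo/keep"
for p in ex1/fileA ex1/subfolder/f0 jerry/ex2/f1 jerry/ex2/foo/f3 keep/f4; do
    echo "needle here" > "$demo/$p"
done
echo "nothing relevant" > "$demo/keep/f5"

# Exclusion list written with CRLF endings and one trailing slash on purpose.
printf '%s\r\n' "$demo/ex1/fileA" "$demo/ex1/subfolder" "$demo/jerry/ex2/" > "$demo.skip"

result=$(search_with_skips "$demo" "needle" "$demo.skip")
echo "$result"
rm -rf "$demo" "$demo.skip"
```

In this demo only the `keep/f4` path should be printed: `fileA` is skipped as a single file, while `subfolder` and `jerry/ex2` are pruned together with everything beneath them. Note that `-path` treats its argument as a shell pattern, so a listed path containing `*`, `?` or `[` would need escaping.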