How to print the file name only when a searched file content pattern is found

I have a Linux search command that works for a specific pattern within a block in multiple files.
But the output is missing the file name for the files where the pattern was found. Since I could have 1000+ files, this is a big problem for me. I can't have the names of all checked files printed, only the ones where the pattern exists.

Update
Got some feedback. I have changed the IDs for the start and end tags within the block. Good to know if you think some (old) answers look strange.

  • 4 => # unique_start_tag
  • 7 => # unique_end_tag

These files are created from a template, so the tags will always be there!

$ cat file1.tr
1
2
ID   : file1
3
# unique_start_tag
55
66
# unique_end_tag
8
9
$ cat file2.tr
1
2
ID   : file2
5
3
# unique_start_tag
5
6
# unique_end_tag
8
9
$ cat file3.tr
1
2
ID   : file3
5
3
# unique_start_tag
5
kalle
6
# unique_end_tag
8
9

What I tried (found after googling...):

1)

$ find ./ -type f -name "*.tr" | xargs sed -n '/# unique_start_tag/!b;:a;/# unique_end_tag/!{$!{N;ba}};{/55/p}'
# unique_start_tag
55
66
# unique_end_tag

2)

$ find ./ -type f -name "*.tr" | xargs sed -n '/# unique_start_tag/!b;:a;/# unique_end_tag/!{$!{N;ba}};{/5/p}'
# unique_start_tag
55
66
# unique_end_tag
# unique_start_tag
5
6
# unique_end_tag
# unique_start_tag
5
kalle
6
# unique_end_tag

The "wildcard approach" output is okay. But without the file name in the printout it is pretty useless, since there can be 1000+ files.

Question

a) How can the command in "2" above be changed to also print the file name before the file content for files where there is a hit? (An extra line break between each file's output would be nice as well.)

So, what I would like from "2" above:

file1.tr
# unique_start_tag
55
66
# unique_end_tag

file2.tr
# unique_start_tag
5
6
# unique_end_tag

file3.tr
# unique_start_tag
5
kalle
6
# unique_end_tag

b) A possible alternative to a? The files I'm dealing with have an "ID" line that could be used to derive the file name.
"ID : file1" is the line for file1. Since this ID maps to the real file name, parsing that line instead is an acceptable alternative to "a". How can this be solved, if you think it's easier than "a"? Example:

$ find ./ -type f -name "*.tr" | xargs sed -n '/ID/p'
ID   : file1
ID   : file2
ID   : file3

But the general requirements are of course the same.

c) How can the printout be limited to only the specific "pattern line"? That is, only the "5" or "55" lines in the examples, in contrast to the output in 1 and 2 above, where the complete block is printed.

d) How can the command be changed to exclude the start and end tags from the output?

All provided solutions must work in the Git Bash terminal. One-liners are good.
$ git version
git version 2.35.1.windows.2

Answer 1

Score: 0

Will this awk version do?

find . -type f -name \*.tr | xargs -n 1 awk 'NR==1{print FILENAME} /4/,/7/{print} END{print ""}' 
./file1.tr
4
55
66
7

./file3.tr
4
5
kalle
6
7

./file2.tr
4
5
6
7
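
Note that the command above prints every file name, whether or not the block contains the pattern, and it still uses the old 4/7 tags. A minimal sketch of a variant that buffers the block and prints the file name only on a hit, assuming the renamed tags from the question's update, LF line endings, and a POSIX awk (print_matching_blocks is a made-up helper name, not part of any tool):

```shell
# Sketch: print "<file name>", the tagged block, and a blank line, but only
# for files whose block contains a line matching the regex given as $1.
# print_matching_blocks is a hypothetical helper name.
print_matching_blocks() {
    pat=$1; shift
    awk -v pat="$pat" '
        # A start tag opens a new block: reset the buffer and the hit flag.
        $0 == "# unique_start_tag" { inblk = 1; buf = ""; hit = 0 }
        inblk {
            buf = buf $0 "\n"
            # Only non-tag lines are tested against the pattern.
            if ($0 != "# unique_start_tag" && $0 != "# unique_end_tag" && $0 ~ pat)
                hit = 1
        }
        # An end tag closes the block: print it only if a line matched.
        $0 == "# unique_end_tag" {
            if (inblk && hit) printf "%s\n%s\n", FILENAME, buf
            inblk = 0
        }
    ' "$@"
}
```

With the sample files from the question, print_matching_blocks 55 file1.tr file2.tr file3.tr should print only file1.tr and its block, while pattern 5 should print all three. The same awk program can also be handed to find . -type f -name '*.tr' -exec awk ... {} + so that a single awk process handles many files.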

Answer 2

Score: 0

Just grep it:

$ ls f[1-3]
f1  f2  f3

$ cat f[1-3]
a
b
a

$ grep -ril a f[1-3]
f1
f3

If you also want the matching lines, collect the file names in an array and loop over it:

mapfile -t arr < <(grep -ril a f[1-3])

for i in "${arr[@]}"; do
    printf '%s ' "$i"; grep a "$i"
done

f1 a
f3 a
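
The loop above matches anywhere in the file, while the question only cares about lines between the two tags. A rough sketch of the same grep idea restricted to the tagged block, which also covers parts c) and d) by printing only the matching lines and no tags. It assumes the renamed tags from the question and LF line endings; block_grep is a hypothetical helper name:

```shell
# Sketch: for each file, print its name and the lines inside the tagged
# block that match the basic regular expression given as $1.
# block_grep is a hypothetical helper name.
block_grep() {
    pat=$1; shift
    for f in "$@"; do
        # sed prints the block, the first grep drops the two tag lines,
        # the second grep keeps only the lines matching the pattern.
        # If nothing matches, grep exits non-zero and the file is skipped.
        hits=$(sed -n '/^# unique_start_tag$/,/^# unique_end_tag$/p' "$f" |
               grep -v '^# unique_' | grep -e "$pat") || continue
        printf '%s\n%s\n\n' "$f" "$hits"
    done
}
```

For example, block_grep 5 ./*.tr. This starts three processes per file, so for 1000+ files a single awk invocation is likely quicker.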

Answer 3

Score: 0

This might work for you (GNU sed):

sed -n '/4/{:a;N;/7/!ba;/55/{F;G;p}}' file1 file2 file3 filen

Gather up the lines between the two regexps (inclusive), then test the gathered block against a third regexp. If it matches, output the current file name (F), append an empty line (G) and print (p) the matched range.

If only the first match is required, add the q command, i.e.

sed -n '/4/{:a;N;/7/!ba;/55/{F;G;p;q}}' file1 file2 file3 filen
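
The commands above still use the old 4 and 7 markers. A sketch of the same gather-then-test technique adapted to the renamed tags from the question's update, assuming GNU sed (the F command is a GNU extension) and LF line endings. sed_block_with_name is a hypothetical wrapper name, and the pattern is pasted into the sed script as-is, so it must not contain a slash:

```shell
# Sketch: same technique as above, with the renamed tags.
# sed_block_with_name is a hypothetical wrapper; $1 is a sed regex without "/".
sed_block_with_name() {
    pat=$1; shift
    # On the start tag, keep appending lines (N) until the pattern space
    # ends with the end tag; if the gathered block matches the pattern,
    # print the file name (F), append an empty line (G) and print (p).
    sed -n '/^# unique_start_tag$/{:a;N;/\n# unique_end_tag$/!ba;/'"$pat"'/{F;G;p}}' "$@"
}
```

For example, sed_block_with_name 55 file1.tr file2.tr file3.tr. The pattern is tested against the whole gathered block including the tag lines, so a pattern such as tag would match every file.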

huangapple
  • Posted on June 2, 2023, 11:53:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76387011.html