Print line containing pattern preceded by different line containing a different pattern
Question
I'm on macOS 13.3 Ventura, hence the BSD versions of grep, awk, et al.
How do I search for and print a line containing a pattern, where that line MUST be preceded by a different line containing a different pattern?
The text contains lines like these (the leading capital letters are for reference only; ... stands for irrelevant characters).
... # An indeterminate number of lines.
A ... "model" = < ...
... # An indeterminate number of lines.
B ... PXS4@0 ...
... # An indeterminate number of lines.
C ... "model" = < ...
... # An indeterminate number of lines.
D ... PXS2@0 ...
... # An indeterminate number of lines.
E ... "model" = < ...
... # An indeterminate number of lines.
F ... PXS1@0 ...
... # An indeterminate number of lines.
G ... "model" = < ...
... # An indeterminate number of lines.
H ... "model" = < ...
... # An indeterminate number of lines.
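ONLY lines with "model" that are preceded by a line with PXS[[:digit:]]@0 should appear. A is excluded because no PXS line comes before it, and H because no new PXS line appears between G and H:
C ... "model" = < ...
E ... "model" = < ...
G ... "model" = < ...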
AFAICT the regex engines in macOS's awk and grep do not support look-behind or look-ahead.
I thought this would find a match of PXS... and then find and print model..., but it prints line "A":
awk '/(PXS[[:digit:]]@0 )+?model" = </ { print }'
This also comes close but prints line "A". Since it prints "A", I don't understand why it doesn't also print "H".
grep -e ".*PXS[[:digit:]]@0 " -e ".*model\" = <\"" | grep -v -e ".*PXS[[:digit:]]@0 "
Enlighten me please!
Answer 1
Score: 0
macOS
perl -n0e 'print $_ =~ /PXS[[:digit:]]@0.*\n.*\n/g' filename | perl -p -e 's/PXS[[:digit:]]@0[^\n]*\n//g'
First step: keep only the lines containing PXS[[:digit:]]@0 and the line that follows each of them.
Second step: remove the lines containing PXS[[:digit:]]@0.
Linux
grep -zoP 'PXS[[:digit:]]@0.*\n.*\n' filename | sed -z -E 's/PXS[[:digit:]]@0[^\n]*\n//g'
grep finds the lines containing PXS[[:digit:]]@0 and the line that follows each of them; sed then removes the lines containing PXS[[:digit:]]@0 from the output.
Answer 2
Score: 0
You can try something like:
awk '/PXS[0-9][@]0/{getline;if(match($0,"model")){ print;}}'
/PXS[0-9][@]0/ matches the prefix line.
getline; reads the next line (and populates $0).
match($0,"model") checks whether that line matches the regexp 'model'.
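Since getline reads exactly one line, the command above only tests the line directly after the PXS line. The sketch below contrasts it with a flag-based variant, which is a different technique and not part of this answer: it remembers that a PXS line was seen and prints the next "model" line, however many lines lie in between. The sample data and the variable name seen are assumptions for illustration.

```shell
# Hypothetical sample: a filler line separates the PXS line from the "model" line.
sample='A "model" = <"a">
B PXS4@0
filler
C "model" = <"c">
H "model" = <"h">'

# getline variant from the answer: inspects only the line right after the PXS line.
adjacent=$(printf '%s\n' "$sample" |
  awk '/PXS[0-9][@]0/{getline; if (match($0,"model")) print}')

# Flag variant (assumed alternative): set a flag on a PXS line,
# print the next "model" line, then clear the flag.
flagged=$(printf '%s\n' "$sample" |
  awk '/PXS[0-9]@0/{seen=1; next} seen && /model/{print; seen=0}')

printf 'adjacent: [%s]\n' "$adjacent"
printf 'flagged:  [%s]\n' "$flagged"
```

With this input the getline variant prints nothing, while the flag variant prints only line C. Line H is skipped because the flag is cleared once a "model" line has been printed.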
Answer 3
Score: 0
Thanks for steering me in the right direction. This gives me lines C, E, and G.
I used a loop in awk to find the first line with PXS[[:digit:]]@0 and a sub-loop to find the second line, the one with "model" = <. The file is deterministic: if the first line is present, the second will be too (but not directly after the first).
I also set awk's field separators, since the final value I want comes after "model" = <" and before ">.
awk -F'<"|">' 'BEGIN {while ((getline) > 0) if ($0 ~ /PXS[[:digit:]]@0 /) {while ((getline) > 0) if ($0 ~ /"model" = </){print $2; break;}}}'
I like having the whole thing in one awk command; my other solution required 5 piped greps.
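As a small illustration of the field-separator part only, the sketch below splits one made-up line the same way. The value Example1,1 is hypothetical and stands in for whatever the real file contains.

```shell
# Hypothetical input line; "Example1,1" is a made-up value for illustration.
line='C ... "model" = <"Example1,1"> ...'

# A multi-character -F is treated as a regular expression:
# a field ends at either <" or ">, so the wanted value lands in $2.
value=$(printf '%s\n' "$line" | awk -F'<"|">' '{ print $2 }')

printf '%s\n' "$value"
```

Here $1 is everything before <", $2 is the text between <" and ">, and $3 is the rest of the line.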
Answer 4
Score: 0
$ awk '/model/ && match(prevline,/PXS[0-9]@0/){print} {prevline=$0}' file
C ... "model" = < ...
E ... "model" = < ...
G ... "model" = < ...