Grep "Exception" but filter out one specific case, based on previous line

Question

In my application, I have modified all IP addresses so as not to disturb the actual production system. As a result, my application is throwing lots of exceptions. These are written to a log file called filename.

I would like to filter the exceptions, but I don't want to see the ones caused by the modification of the IP addresses.

This sounds very easy, because those exceptions are preceded by a line containing Failed to connect.

Let's see how to do this:

Filter on exceptions:

grep "Exception" filename

Show also the previous line:

grep -B 1 "Exception" filename

Do not show the lines containing "Failed to connect":

grep -B 1 "Exception" filename | grep -v "Failed to connect"

=> No, this is not what I want: this filters out the lines containing the words "Failed to connect", but the actual exceptions are still shown. How can I filter out those exceptions too?
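
A small reproduction makes the problem visible. The sample lines below are made up for illustration and are not taken from the real log:

```shell
# Made-up sample log (the real file is called "filename" in the question).
log=$(mktemp)
printf '%s\n' \
  'x Failed to connect y' \
  'x Exception one' \
  'noise' \
  'x something else y' \
  'x Exception two' > "$log"

# grep -v removes only the "Failed to connect" line itself;
# the Exception line that followed it is still printed.
naive=$(grep -B 1 "Exception" "$log" | grep -v "Failed to connect")
printf '%s\n' "$naive"
rm -f "$log"
```

Both Exception lines survive the second grep, because by that point the pipeline no longer knows which line preceded which.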

My filename contents are something like:

... Failed to connect ...
... Exception ...
...
... (lots of these)
...
... <something else than "Failed to connect">
... Exception ...
...
... Failed to connect ...
... Exception ...
...
... (again lots of these)
...

I'm only interested in the ... Exception ... lines that are not preceded by "Failed to connect".

When I run man grep, the page ends with:

GNU grep 3.4 ... 2019-12-29

Does anybody have an idea?
Thanks in advance

Answer 1

Score: 1

Using GNU grep, you may be able to do this:

grep -zoP '(?m)^(?!.*Failed to connect).+\R.*Exception.*\R' file

... foo bar baz
... Exception ...

# where file content is
cat file

.. Failed to connect ...
... Exception ...
...
... (lots of these)
...
... foo bar baz
... Exception ...
...
... Failed to connect ...
... Exception ...
...
... (again lots of these)
...

Command Details:

  • -z: Treat the input as NUL-terminated records, so a file without NUL bytes is processed as a whole instead of one line at a time
  • -o: Return only the matched text
  • -P: Enable PCRE mode
  • (?m): Enable MULTILINE mode
  • ^: Match a line start
  • (?!.*Failed to connect): Negative lookahead that fails the match when Failed to connect appears anywhere on that line
  • .+\R: Match 1+ of any character followed by a line break
  • .*Exception.*\R: Match 0+ of any character, then the text Exception, then again 0+ of any character followed by a line break
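
The command can be tried end to end with the sketch below. It assumes a GNU grep built with PCRE support (-P); the sample lines and the variable names are made up for illustration. Because -z also makes grep terminate each match with a NUL byte, the sketch strips those bytes before using the result:

```shell
# Made-up sample log standing in for "file" above.
log=$(mktemp)
printf '%s\n' \
  'x Failed to connect y' \
  'x Exception one' \
  'noise' \
  'x something else y' \
  'x Exception two' \
  'tail' > "$log"

# With -z, grep ends each match with a NUL byte rather than a newline,
# so strip the NULs before capturing or displaying the result.
matches=$(grep -zoP '(?m)^(?!.*Failed to connect).+\R.*Exception.*\R' "$log" | tr -d '\0')
printf '%s\n' "$matches"

# Keep only the Exception lines, dropping the preceding context line.
only_exceptions=$(printf '%s\n' "$matches" | grep "Exception")
printf '%s\n' "$only_exceptions"
rm -f "$log"
```

The second grep is optional: it is only needed when the preceding line should not appear in the output.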

Answer 2

Score: 1

With your shown samples, please try the following GNU awk code. It was written and tested in GNU awk: RS is set to [^\n]*\\n[^E]*Exception[^\n]*, and the match function then extracts only the required part of GNU awk's RT variable, as per the shown output.

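awk -v RS='[^\n]*\\n[^E]*Exception[^\n]*' '
RT{
  # RT holds the text matched by RS: a line, a newline,
  # then everything up to and including the next Exception line
  if(RT!~/Failed to connect/ && RT~/Exception/){
    # arr[2] receives the last two lines of RT
    match(RT,/(^|\n)([^\n]*\n[^\n]*$)/,arr)
    sub(/^\n/,"",arr[1])
    print arr[1],arr[2]
  }
}
' Input_file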

Or we could make the above more efficient as follows:

awk -v RS='[^\n]*\\n[^E]*Exception[^\n]*' '
RT{
  if(RT!~/Failed to connect/ && RT~/Exception/){
    match(RT,/(^|\n)([^\n]*\n[^\n]*$)/,arr)
    print arr[2]
  }
}
' Input_file
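
For comparison, the same filtering can be sketched in plain awk without GNU-specific features, by remembering the previous line. This is not part of the answer above, only a minimal sketch; the function name filter_exceptions, the variable prev, and the sample lines are made up for illustration:

```shell
# Print each Exception line whose preceding line lacks "Failed to connect".
# prev holds the previous input line (empty while reading the first line).
filter_exceptions() {
  awk '/Exception/ && prev !~ /Failed to connect/ { print }
       { prev = $0 }' "$1"
}

# Made-up sample log standing in for Input_file.
log=$(mktemp)
printf '%s\n' \
  'x Failed to connect y' \
  'x Exception one' \
  'noise' \
  'x something else y' \
  'x Exception two' > "$log"

result=$(filter_exceptions "$log")
printf '%s\n' "$result"
rm -f "$log"
```

Unlike the RS-based version, this prints only the Exception line itself. It also treats an Exception on the very first line as not preceded by "Failed to connect".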

huangapple
  • Posted on February 8, 2023, 18:30:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75384440.html