February 8, 2023 18:30:03

Grep "Exception" but filter out one specific case, based on previous line

Question

In my application, I have modified all IP addresses so as not to disturb the actual production system. As a result, my application throws lots of exceptions. These are written to a logfile called filename.

I would like to filter the exceptions, but I don't want to see the ones caused by the modification of the IP addresses.

This sounds very easy, because those exceptions are preceded by a line containing Failed to connect.

Let's see how to do this:

Filter on exceptions:

grep "Exception" filename

Also show the previous line:

grep -B 1 "Exception" filename

Do not show the lines containing "Failed to connect":

grep -B 1 "Exception" filename | grep -v "Failed to connect"

=> No, this is not what I want: it filters out the lines containing the words "Failed to connect", but the exceptions that follow them are still shown. How can I filter out those exceptions too?

My filename contents are something like:

... Failed to connect ...
... Exception ...
...
... (lots of these)
...
... <something other than "Failed to connect">
... Exception ...
...
... Failed to connect ...
... Exception ...
...
... (again lots of these)
...
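
The behavior of the pipeline above on a file of this shape can be reproduced with a small sketch. The temporary file and the variable names (log, filtered, survivors) are hypothetical and only stand in for filename; the sample lines are a shortened version of the listing.

```shell
# Build a small sample log with the same shape as the listing above.
log=$(mktemp)   # hypothetical temporary file standing in for "filename"
printf '%s\n' \
  '... Failed to connect ...' \
  '... Exception ...' \
  '...' \
  '... foo bar baz' \
  '... Exception ...' \
  '...' \
  '... Failed to connect ...' \
  '... Exception ...' \
  '...' > "$log"

# The pipeline from the question: the second grep only drops the context
# lines, so every Exception line still comes through.
filtered=$(grep -B 1 "Exception" "$log" | grep -v "Failed to connect")
printf '%s\n' "$filtered"

# Count the Exception lines that survived the filter.
survivors=$(printf '%s\n' "$filtered" | grep -c "Exception")
echo "Exception lines still shown: $survivors"
rm -f "$log"
```

The second grep sees each line in isolation, so it has no way to know which Exception line belonged to a removed "Failed to connect" line.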

I'm only interested in the ... Exception ... lines that are not preceded by a line containing "Failed to connect".

The man grep page ends with:

GNU grep 3.4 ... 2019-12-29

Does anybody have an idea?
Thanks in advance

Answer 1

Score: 1

With GNU grep, you may be able to do this:

grep -zoP '(?m)^(?!.*Failed to connect).+\R.*Exception.*\R' file
... foo bar baz
... Exception ...

# where the file content is
cat file
.. Failed to connect ...
... Exception ...
...
... (lots of these)
...
... foo bar baz
... Exception ...
...
... Failed to connect ...
... Exception ...
...
... (again lots of these)
...

Command Details:

-z: Treat the input as NUL-terminated records, so a file without NUL bytes is processed as a whole instead of one line at a time
-o: Return only the matched text
-P: Enable PCRE mode
(?m): Enable MULTILINE mode, so that ^ matches at the start of every line
^: Match a line start
(?!.*Failed to connect): Negative lookahead that rejects the match when Failed to connect appears anywhere on that line
.+\R: Match one or more characters (the line before the exception), followed by a line break
.*Exception.*\R: Match zero or more characters, then the text Exception, then zero or more characters again, followed by a line break
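
A self-contained sketch for trying the command, assuming a GNU grep built with PCRE support. The temporary file and the variable names (log, matches) are hypothetical. One detail to keep in mind: with -z, grep terminates each output record with a NUL byte instead of a newline, so the sketch strips those with tr before capturing the result.

```shell
# Build a shortened version of the sample file shown in the answer.
log=$(mktemp)   # hypothetical temporary file standing in for "file"
printf '%s\n' \
  '... Failed to connect ...' \
  '... Exception ...' \
  '...' \
  '... foo bar baz' \
  '... Exception ...' \
  '...' \
  '... Failed to connect ...' \
  '... Exception ...' \
  '...' > "$log"

# -z makes grep end each match with a NUL byte, so remove those bytes
# before displaying or capturing the output.
matches=$(grep -zoP '(?m)^(?!.*Failed to connect).+\R.*Exception.*\R' "$log" | tr -d '\0')
printf '%s\n' "$matches"
rm -f "$log"
```

Because the pattern requires a non-empty line before the Exception line, an Exception on the very first line of the file, or one that follows an empty line, would not be reported.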

Answer 2

Score: 1

With your shown samples, please try the following GNU awk code. It was written and tested in GNU awk: RS is set to [^\n]*\\n[^E]*Exception[^\n]*, and the match function then extracts only the required part from GNU awk's RT variable.

awk -v RS='[^\n]*\\n[^E]*Exception[^\n]*' '
RT{
  # RT holds the text that matched RS, ending in an Exception line
  if(RT!~/Failed to connect/ && RT~/Exception/){
    # arr[2] captures the last two lines of RT
    match(RT,/(^|\n)([^\n]*\n[^\n]*$)/,arr)
    sub(/^\n/,"",arr[1])
    print arr[1],arr[2]
  }
}
' Input_file

Or, more simply, print only the second capture group:

awk -v RS='[^\n]*\\n[^E]*Exception[^\n]*' '
RT{
  if(RT!~/Failed to connect/ && RT~/Exception/){
    match(RT,/(^|\n)([^\n]*\n[^\n]*$)/,arr)
    print arr[2]
  }
}
' Input_file
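
For comparison, a different and simpler technique than the RS/RT approach above is to let awk remember the previous line. This is only a sketch, not the answerer's method; it should work in any POSIX awk, and the temporary file and the names log, kept and prev are hypothetical.

```shell
# Build a shortened version of the sample file from the question.
log=$(mktemp)   # hypothetical temporary file standing in for "Input_file"
printf '%s\n' \
  '... Failed to connect ...' \
  '... Exception ...' \
  '...' \
  '... foo bar baz' \
  '... Exception ...' \
  '...' \
  '... Failed to connect ...' \
  '... Exception ...' \
  '...' > "$log"

# Keep the previous line in prev. Print an Exception line, together with
# its predecessor when there is one, only if that predecessor does not
# contain "Failed to connect".
kept=$(awk '
  /Exception/ && prev !~ /Failed to connect/ { if (NR > 1) print prev; print }
  { prev = $0 }
' "$log")
printf '%s\n' "$kept"
rm -f "$log"
```

Dropping the "if (NR > 1) print prev;" part would print only the Exception lines themselves.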
