2023-07-10 22:29:57

awk to search a pattern and print lines with pattern and one more condition

Question

I have an input file as shown below:

SR  policy name    state             error code
1   backup01       successful         0
2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

I want an output where, if any line for a particular policy (for example backup01) has a "fail" state, only the "fail" state lines are shown and the "successful" lines are not. Similarly, where all lines of a policy (for example backup02) have the state "successful", all the "successful" state lines are printed.

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

I have tried using awk, with little success, and have not been able to reach a final solution.

awk '{if ($4 == 0) {print $0} else if($4 !=0 && $4 == 0) {print $0}}' input_file.txt

Any other way using sed or another tool is also fine. The header from the input file can also be ignored.

Answer 1

Score: 2

I would use GNU AWK in the following way. Let file.txt contain:

SR  policy name    state             error code
1   backup01       successful         0
2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

then

awk 'NR>1{arr[$2][$3]=arr[$2][$3] RS $0}END{for(i in arr){print substr(arr[i]["fail" in arr[i]?"fail":"successful"],2)}}' file.txt

gives the output

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

Explanation: for every line after the first, I populate a 2D array (arrays of arrays are a GNU AWK extension) whose first key is the policy name and whose second key is the state. I append the whole line, prefixed with the record separator (RS), so each value is a set of lines separated by newlines. Once the file has been processed, I go through the array and, for each policy name, print what is stored under fail if it exists, otherwise what is stored under successful. I start printing at the 2nd character because of the leading newline introduced by the way I populate the array. Disclaimer: this solution assumes you accept any order of lines in the output (the traversal order of for (i in arr) is unspecified); if that does not hold, do not use it.

(tested in GNU Awk 5.1.0)

Answer 2

Score: 2

One awk idea:

awk '
NR>1 { # keep track of "fail" and "successful" entries in different arrays
       if ($3 == "fail"       ) fail[$2]    = fail[$2]    (fail[$2]    == "" ? "" : ORS) $0
       if ($3 == "successful" ) success[$2] = success[$2] (success[$2] == "" ? "" : ORS) $0
     }
END  { for (policy in fail) {               # loop through fail[] array indices (policy names) ...
           print fail[policy]               # print the fail[] entry and ...
           delete success[policy]           # delete any success[] entry for the same policy
       }
       for (policy in success)              # loop through remaining success[] array indices ...
           print success[policy]            # print the success[] entry
     }
' input_file.txt

This generates:

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

NOTES:

while all lines for a given policy will be consecutive, this solution does not guarantee the output will be ordered by policy
if OP needs the output ordered by policy, one approach would be to pipe the output to sort (e.g., awk '<see_script_above>' input_file.txt | sort -k2,2)

NOTE: original answer before OP provided additional details ...

Assumptions:

file is already sorted by the policy name (2nd) column
if the state (3rd) column contains any values other than fail or successful then the associated line is ignored

One awk idea requiring a single pass through the data file:

awk '
function print_lines() {
         if ( fail    != "" ) print fail
    else if ( success != "" ) print success
    fail = success = ""
}
NR>1 { if (prev != $2) print_lines()          # print previous policy
       prev = $2
       if ($3 == "fail"       ) fail    = fail    (fail    == "" ? "" : ORS) $0
       if ($3 == "successful" ) success = success (success == "" ? "" : ORS) $0
     }
END  { print_lines() }                        # print last policy
' input_file.txt

This generates:

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

Answer 3

Score: 1

It can be done by processing the same file in two passes. The first pass works out, for each policy, whether its status is going to be "fail" or "successful"; the second pass then excludes the contradicting lines.

awk '
  FNR == NR && NR > 1 { 
    t = key[$2]
    if ( $3 == "fail" || t == "fail" ) { 
      key[$2] = "fail" 
    } else {
      key[$2] = $3
    }
    next 
  } 
  key[$2] == $3
' file file
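
The two-pass idea can be boiled down a little further. The following is a hedged sketch rather than the answer's exact logic: it assumes the only thing worth remembering from the first pass is whether a policy ever failed (the array name failed is mine), and it uses made-up rows in which the policies are interleaved, to show that reading the file twice does not depend on the input being sorted.

```shell
# Hypothetical sample in which the policies are NOT grouped together.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
SR  policy name    state             error code
1   backup01       successful         0
2   backup02       successful         0
3   backup01       fail               13
4   backup02       successful         0
EOF

# The same file is given twice; NR == FNR is true only while reading the first copy.
result=$(awk '
NR == FNR {                        # pass 1: remember which policies ever failed
    if ($3 == "fail") failed[$2] = 1
    next
}
FNR == 1 { next }                  # pass 2: skip the header
($2 in failed) ? $3 == "fail" : 1  # failed policy: fail rows only; otherwise every row
' "$tmp" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Since the second pass prints lines as it reads them, the original line order is kept, at the cost of reading the file twice.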

Answer 4

Score: 1

Here is a single pass awk to solve the problem:

awk '
NR == 1 {
   print
   next
}
p != $2 {                 # new policy: flush the previous group first
   printf "%s", s
   s = fail = ""
}
!fail {
   fail = ($3 == "fail")
}
fail {
   gsub(/[^\n]+ successful [^\n]+\n/, "", s)
}
{
   s = s $0 ORS
   p = $2
}
END {
   if (fail)              # drop a successful line buffered after the last fail
      gsub(/[^\n]+ successful [^\n]+\n/, "", s)
   printf "%s", s
}' file
SR  policy name    state             error code
2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

Answer 5

Score: 1

This might work for you (GNU sed):

sed -E '1d;:a;$!{N;/^(\S+\s+){2}.*[^\n]+$/ba};$!{h;s/\n[^\n]+$//}
        /fail/{s/.*successful.*//mg;s/(^\n|(\n))\n*//gp}
        /successful/p;$d;x;s/^.*\n//;ba' file

Ditch the header.

Continue to fetch lines until the end-of-file or a change in the second field.

If it is not the end-of-file, make a copy and remove the last line (the change in the second field).

If the lines contain fail, then remove any lines (and their linefeeds) that contain the word successful and print the result.

Otherwise, print all the lines, as they are all successful.

If it is the end-of-file, no further processing is needed.

Otherwise, swap to the copy, remove all the lines other than the last (the lines already processed) and repeat.

N.B. This expects the second field i.e. the key, to be in sorted order.
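
Since this approach relies on the lines for each key being adjacent, it may be worth checking that before running it. Below is a rough sketch of such a check, under my own assumptions: is_grouped is a hypothetical helper name, the key is taken to be the second whitespace-separated field, the first line is treated as a header, and the sample rows are made up to show the failing case.

```shell
# is_grouped FILE: succeed only if every key in column 2 forms one contiguous run.
# (Hypothetical helper; the header line is skipped.)
is_grouped() {
    awk '
    NR > 1 {
        if ($2 != prev) {
            if ($2 in seen) { bad = 1; exit }   # this key already ended earlier
            seen[$2] = 1
        }
        prev = $2
    }
    END { exit bad + 0 }' "$1"
}

# Made-up sample where backup01 reappears after backup02.
sample=$(mktemp)
cat > "$sample" <<'EOF'
SR  policy name    state             error code
1   backup01       successful         0
2   backup02       successful         0
3   backup01       fail               13
EOF

if is_grouped "$sample"; then
    echo "grouped: the single-pass approaches apply"
else
    echo "not grouped: sort on the second field first, or use a two-pass approach"
fi
rm -f "$sample"
```

With the sample above the check fails, because backup01 shows up again after backup02 has started.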
