awk to search a pattern and print lines with pattern and one more condition

Question

I have an input file as shown below:

SR  policy name    state             error code
1   backup01       successful         0
2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

I want output where, if any line for a particular policy (for example, backup01) has the state "fail", only that policy's "fail" lines are shown and its "successful" lines are omitted. Similarly, where all lines of a policy (for example, backup02) have the state "successful", all of its "successful" lines are printed.

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

I have tried using awk with little success and have not been able to arrive at a final solution.

awk '{if ($4 == 0) {print $0} else if($4 !=0 && $4 == 0) {print $0}}' input_file.txt

Any other way using sed or another tool is also fine. The header from the input file can also be ignored.

Answer 1

Score: 2

I would use GNU AWK in the following way. Let the content of file.txt be

SR  policy name    state             error code
1   backup01       successful         0
2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

then

awk 'NR>1{arr[$2][$3]=arr[$2][$3] RS $0}END{for(i in arr){print substr(arr[i]["fail" in arr[i]?"fail":"successful"],2)}}' file.txt

gives output

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

Explanation: for every line after the first, I populate a 2D array whose first key is the policy name and whose second key is the state. I append the whole line preceded by the record separator (RS), so each value is a set of lines separated by newlines. Once the file has been processed, I go through the array and, for each policy name, print what is stored under fail if it exists, otherwise what is stored under successful. I start printing at the 2nd character because the way I populate the array leaves a leading newline. Note that arrays of arrays are a GNU AWK extension. Disclaimer: this solution assumes you accept any order of lines in the output; if that does not hold, do not use it.

(tested in GNU Awk 5.1.0)
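
If the output order matters, the same idea can be sketched without arrays of arrays by remembering the order in which policies first appear. This is only an illustrative variant, not the answer's own code: the function name filter_policies and the arrays seen, order, fail and ok are made up for this sketch, and it assumes the state column contains only fail or successful.

```shell
#!/bin/sh
# Hypothetical helper: for each policy keep only its "fail" lines if it has
# any, otherwise keep its "successful" lines. Policies are printed in the
# order they first appear in the input. Reads files given as arguments, or
# standard input when none are given. The first line (header) is skipped.
filter_policies() {
    awk '
    NR > 1 {
        # remember the order in which each policy name is first seen
        if (!($2 in seen)) { seen[$2] = 1; order[++n] = $2 }
        if ($3 == "fail")            fail[$2] = fail[$2] $0 ORS
        else if ($3 == "successful") ok[$2]   = ok[$2]   $0 ORS
    }
    END {
        for (i = 1; i <= n; i++) {
            p = order[i]
            # fail lines win; otherwise print the successful lines
            printf "%s", ((p in fail) ? fail[p] : ok[p])
        }
    }' "$@"
}

# Demo with interleaved policies (made-up sample data)
printf '%s\n' \
    'SR policy state code' \
    '1 backup02 successful 0' \
    '2 backup01 successful 0' \
    '3 backup02 successful 0' \
    '4 backup01 fail 13' | filter_policies
```

For the demo input this prints lines 1 and 3 (backup02) followed by line 4 (backup01). Because it uses only one-dimensional arrays, it should also run in awk implementations other than GNU AWK.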

Answer 2

Score: 2

One awk idea:

awk '
NR>1 { # keep track of "fail" and "successful" entries in different arrays

       if ($3 == "fail"       ) fail[$2]    = fail[$2]    (fail[$2]    == "" ? "" : ORS) $0
       if ($3 == "successful" ) success[$2] = success[$2] (success[$2] == "" ? "" : ORS) $0
     }
END  { for (policy in fail) {               # loop through fail[] array indices (policy names) ...
           print fail[policy]               # print the fail[] entry and ...
           delete success[policy]           # delete any success[] entry for the same policy
       }
       for (policy in success)              # loop through remaining success[] array indices ...
           print success[policy]            # print the success[] entry
     }
' input_file.txt

This generates:

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

NOTES:

  • while all lines for a given policy will be consecutive, this solution does not guarantee the output will be ordered by policy
  • if OP needs the output ordered by policy, one approach would be to pipe the output to sort (e.g., awk '<see_script_above>' input_file.txt | sort -k2,2)

NOTE: original answer before OP provided additional details ...

Assumptions:

  • file is already sorted by the policy name (2nd) column
  • if the state (3rd) column contains any values other than fail or successful then the associated line is ignored

One awk idea requiring a single pass through the data file:

awk '
function print_lines() {
         if ( fail    != "" ) print fail
    else if ( success != "" ) print success
    fail = success = ""
}

NR>1 { if (prev != $2) print_lines()          # print previous policy
       prev = $2
       if ($3 == "fail"       ) fail    = fail    (fail    == "" ? "" : ORS) $0
       if ($3 == "successful" ) success = success (success == "" ? "" : ORS) $0
     }
END  { print_lines() }                        # print last policy
' input_file.txt

This generates:

2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

Answer 3

Score: 1

This can be done by processing the same file in two passes. The first pass records, for each policy, whether it counts as a "fail" or a "successful" policy; the second pass then prints only the lines whose state matches that status:

awk '
  FNR == NR && NR > 1 { 
    t = key[$2]
    if ( $3 == "fail" || t == "fail" ) { 
      key[$2] = "fail" 
    } else {
      key[$2] = $3
    }
    next 
  } 
  key[$2] == $3
' file file
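
Since the second pass prints lines as they are read, the two-pass approach keeps the input order and does not need the file to be sorted by policy. The following is a hedged sketch of that behaviour on made-up, interleaved data. The wrapper name two_pass_filter and the temporary file are assumptions for illustration, and the awk body is a condensed variant of the script above rather than a copy of it.

```shell
#!/bin/sh
# Hypothetical wrapper around the two-pass idea. It takes one file name,
# because standard input cannot be read twice.
two_pass_filter() {
    awk '
    FNR == NR {                        # first pass over the file
        # once a policy is marked "fail" it stays "fail";
        # otherwise record the state of the current line
        if (FNR > 1 && key[$2] != "fail") key[$2] = $3
        next
    }
    FNR > 1 && key[$2] == $3           # second pass: print matching lines
    ' "$1" "$1"
}

# Demo: policies are interleaved and the failure comes before a success
tmp=$(mktemp)
printf '%s\n' \
    'SR policy state code' \
    '1 backup02 successful 0' \
    '2 backup01 fail 13' \
    '3 backup02 successful 0' \
    '4 backup01 successful 0' > "$tmp"
two_pass_filter "$tmp"
rm -f "$tmp"
```

For this demo input, lines 1, 2 and 3 are printed in their original order, and line 4 is dropped because backup01 has a failed line.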

Answer 4

Score: 1

Here is a single-pass awk solution to the problem:

awk '
NR == 1 {
   print
   next
}
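p != $2 {
   # policy changed: flush the previous group before looking at this line
   printf "%s", s
   s = fail = ""
}
!fail {
   fail = ($3 == "fail")
}
{
   s = s $0 ORS
   p = $2
}
fail {
   # drop any buffered "successful" lines of the current policy
   gsub(/[^\n]+ successful [^\n]+\n/, "", s)
}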
END {
   printf "%s", s
}' file

SR  policy name    state             error code
2   backup01       fail               13
3   backup01       fail               58
4   backup02       successful         0
5   backup02       successful         0
6   backup02       successful         0

Answer 5

Score: 1

This might work for you (GNU sed):

sed -E '1d;:a;$!{N;/^(\S+\s+){2}.*[^\n]+$/ba};$!{h;s/\n[^\n]+$//}
        /fail/{s/.*successful.*//mg;s/(^\n|(\n))\n*//gp}
        /successful/p;$d;x;s/^.*\n//;ba' file

Ditch the header.

Continue to fetch lines until the end-of-file or a change in the second field.

If it is not the end-of-file, make a copy and remove the last line (the change in the second field).

If the lines contain fail, then remove any lines (and their linefeeds) that contain the word successful and print the result.

Otherwise, print all the lines, as they are all successful.

If it is the end-of-file, no further processing is needed.

Otherwise, swap to the copy, remove all the lines other than the last (the lines already processed), and repeat.

N.B. This expects the second field, i.e. the key, to be in sorted order.
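
When the input is not already grouped by the second field, one possible preparation step is to sort the data lines first while leaving the header in place. The sketch below assumes a one-line header and whitespace-separated columns; the function name sort_by_policy is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the header line unchanged, then the remaining
# lines sorted by policy name (field 2) and, within a policy, numerically
# by the SR number (field 1).
sort_by_policy() {
    head -n 1 "$1"
    # -b ignores the leading blanks that belong to field 2
    tail -n +2 "$1" | sort -b -k2,2 -k1,1n
}

# Demo with made-up, unsorted data
tmp=$(mktemp)
printf '%s\n' \
    'SR policy state code' \
    '1 backup02 successful 0' \
    '2 backup01 fail 13' \
    '3 backup01 successful 0' > "$tmp"
sort_by_policy "$tmp"
rm -f "$tmp"
```

The result still starts with the header, so it could be piped into a command that expects grouped input and discards the first line itself.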

huangapple
  • Published on July 10, 2023 at 22:29:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76654761.html