How to loop over a file and print a new line for two columns with awk

Question

This is my file; I'm trying to process its 4th and 5th columns:

Server3 has 61 PG Plan1
Server3 has 29 PG Plan6
Server4 has 18 PG Plan4
Server3 has 21 PG Plan2
Server6 has 31 PG Plan8
Server8 has 12 PG Plan3
Server3 has 10 BG Plan7
Server5 has 45 BG Plan5

This works as expected:

awk '{print $4, $5}' file

Output:

PG Plan1
PG Plan6
PG Plan4
PG Plan2
PG Plan8
PG Plan3
BG Plan7
BG Plan5

But when I try a while loop, everything is printed on one line:

cat file | while read line; do echo $(awk '{print $4, $5}'); done

Output:

PG Plan6 PG Plan4 PG Plan2 PG Plan8 PG Plan3 BG Plan7 BG Plan5

I tried these variations, but I still get the output above:

cat file | while read line; do echo $(awk '{print $4} {print $5} {printf "\n"}'); done
cat file | while read line; do echo $(awk '{print $4, $5}'); done

How can I print the 4th and 5th columns together on one line, followed by a newline (\n), on each iteration of the loop?

Expected output is:

PG Plan1
PG Plan6
PG Plan4
PG Plan2
PG Plan8
PG Plan3
BG Plan7
BG Plan5

Update 1

I was asked in the comments why I don't just use awk '{print $4, $5}' file, since it works and shows the expected output. The goal is to compare two files and, if they match, do something.

This is file (already shown above):

Server3 has 61 PG Plan1
Server3 has 29 PG Plan6
Server4 has 18 PG Plan4
Server3 has 21 PG Plan2
Server6 has 31 PG Plan8
Server8 has 12 PG Plan3
Server3 has 10 BG Plan7
Server5 has 45 BG Plan5

This is anotherfile:

-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "PG Plan1"
  monthly: 100.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "PG Plan6"
  monthly: 187.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "PG Plan4"
  monthly: 155.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "PG Plan2"
  monthly: 125.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "PG Plan8"
  monthly: 225.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "PG Plan3"
  monthly: 140.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "BG Plan7"
  monthly: 200.00
-
  gpid: 46
  gpname: "Some MadeUp Name (Not Important)"
  pid: 308
  name: "BG Plan5"
  monthly: 176.00
-

The final goal (I'm going step by step, but I don't yet know the command to get there):

> For every line in file, check whether columns 4 and 5 together equal a name in anotherfile; if so, multiply the 3rd column of file by the matching monthly value in anotherfile and print the result for each line.

Expected output:

Server3 has 61 PG Plan1 (profit 6,100)
Server3 has 29 PG Plan6 (profit 5,423)
Server4 has 18 PG Plan4 (profit 2,790)
Server3 has 21 PG Plan2 (profit 2,625)
Server6 has 31 PG Plan8 (profit 6,975)
Server8 has 12 PG Plan3 (profit 1,680)
Server3 has 10 BG Plan7 (profit 2,000)
Server5 has 45 BG Plan5 (profit 7,920)

Answer 1

Score: 3

Addressing OP's original question about using a while/read loop to process lines and display the 4th and 5th fields.

Assumptions:

  • OP needs both $line and the 4th and 5th fields (from $line) for some as yet undefined purpose within the while loop; otherwise a simple one-line awk script would be an easier and faster way to print the 4th and 5th fields

One approach:

while read -r line
do
    # split $line on whitespace into f1..f5; rest_of_line catches any fields beyond the 5th
    read -r f1 f2 f3 f4 f5 rest_of_line <<< "$line"
    echo "$line : $f4 : $f5"
done < file

Notes:

  • this eliminates the unnecessary use of cat
  • this eliminates all subshell invocations (e.g., cat file | and $(awk ...))
  • this allows other variables to be populated inside the while loop and then used later in the script (outside the while loop)

This generates:

Server3 has 61 PG Plan1 : PG : Plan1
Server3 has 29 PG Plan6 : PG : Plan6
Server4 has 18 PG Plan4 : PG : Plan4
Server3 has 21 PG Plan2 : PG : Plan2
Server6 has 31 PG Plan8 : PG : Plan8
Server8 has 12 PG Plan3 : PG : Plan3
Server3 has 10 BG Plan7 : BG : Plan7
Server5 has 45 BG Plan5 : BG : Plan5

If OP does not need the contents of $line then we can combine the two read commands:

while read -r f1 f2 f3 f4 f5 rest_of_line
do
    echo "$f4 : $f5"
done < file

This generates:

PG : Plan1
PG : Plan6
PG : Plan4
PG : Plan2
PG : Plan8
PG : Plan3
BG : Plan7
BG : Plan5
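
The note above about variables surviving the loop can be checked with a minimal sketch. The two-line sample and the variable names piped_total and redirected_total are made up for illustration, and the behaviour shown assumes bash with its default settings (the lastpipe option off):

```shell
#!/usr/bin/env bash
# Hypothetical sample data, written to a temporary file for the demonstration.
tmpfile=$(mktemp)
printf '%s\n' 'Server3 has 61 PG Plan1' 'Server4 has 18 PG Plan4' > "$tmpfile"

# Piped loop: the loop body runs in a subshell, so the change to
# piped_total is lost as soon as the loop ends.
piped_total=0
cat "$tmpfile" | while read -r f1 f2 f3 f4 f5 rest_of_line
do
    piped_total=$(( piped_total + f3 ))
done

# Redirected loop: the body runs in the current shell, so the total survives.
redirected_total=0
while read -r f1 f2 f3 f4 f5 rest_of_line
do
    redirected_total=$(( redirected_total + f3 ))
done < "$tmpfile"

rm -f "$tmpfile"
echo "piped: $piped_total, redirected: $redirected_total"
```

Only the redirected loop keeps the sum of the 3rd column (61 + 18); the piped loop leaves its variable at 0.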

Answer 2

Score: 2

$ cat tst.awk
NR == FNR {                                       # first file (anotherfile)
    if ( /^-/ ) {                                 # a "-" line closes the previous record
        name2monthly[tag2val["name"]] = tag2val["monthly"]
    }
    else {
        gsub(/^[[:space:]]+|[[:space:]]+$/,"")    # trim leading/trailing whitespace
        tag = val = $0
        sub(/:.*/,"",tag)                         # tag = text before the first colon
        sub(/[^:]*:[[:space:]]*/,"",val)          # val = text after the first colon
        gsub(/^"|"$/,"",val)                      # strip surrounding double quotes
        tag2val[tag] = val
    }
    next
}
{
    # the ' flag requests thousands separators in locales that define them
    printf "%s (profit %'d)\n", $0, $3 * name2monthly[$4" "$5]
}

$ LC_ALL=en_US.UTF-8 awk -f tst.awk anotherfile file
Server3 has 61 PG Plan1 (profit 6,100)
Server3 has 29 PG Plan6 (profit 5,423)
Server4 has 18 PG Plan4 (profit 2,790)
Server3 has 21 PG Plan2 (profit 2,625)
Server6 has 31 PG Plan8 (profit 6,975)
Server8 has 12 PG Plan3 (profit 1,680)
Server3 has 10 BG Plan7 (profit 2,000)
Server5 has 45 BG Plan5 (profit 7,920)

For each line within the current record of anotherfile (records are delimited by - lines), the script first stores a tag-to-value pair in tag2val (e.g. tag pid maps to value 308 in every record). When it reaches the - line that closes the record, it adds an entry to name2monthly that maps that record's name value to its monthly value.

It then reads file and multiplies $3 by the monthly value stored in name2monthly[] for the name formed by $4 and $5.
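
The two-pass lookup can be tried on a cut-down sample with the sketch below. It is a simplified variant, not the script above: it assumes every name is exactly two words and that monthly follows name within a record, and it prints plain %d so the result does not depend on the locale. The temporary directory and the profits variable are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical two-record sample of each input, written to a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/anotherfile" <<'EOF'
-
  name: "PG Plan1"
  monthly: 100.00
-
  name: "PG Plan6"
  monthly: 187.00
-
EOF
printf '%s\n' 'Server3 has 61 PG Plan1' 'Server3 has 29 PG Plan6' > "$workdir/file"

profits=$(awk '
    NR == FNR {    # first file: build the name -> monthly map
        if ($1 == "name:")    { name = $2 " " $3; gsub(/"/, "", name) }
        if ($1 == "monthly:") { monthly[name] = $2 }
        next
    }
    # second file: look up the monthly value for "$4 $5" and multiply by $3
    { printf "%s (profit %d)\n", $0, $3 * monthly[$4 " " $5] }
' "$workdir/anotherfile" "$workdir/file")

rm -rf "$workdir"
printf '%s\n' "$profits"
```

For the second sample line this computes 29 × 187.00 = 5423, which matches the 5,423 shown in the output above.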

Answer 3

Score: 0

In the question's loop, $line is never passed to awk. Because awk is given no file argument, it reads the rest of the file from the loop's standard input, and the unquoted $(...) makes echo join all of that output onto one line. This should work:

cat file | while read line; do echo "$line" | awk '{print $4, $5}'; done

If the $(...) command substitution is absolutely necessary, this should work too:

cat file | while read line; do echo $(echo "$line" | awk '{print $4, $5}'); done
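
The difference between the two forms can be reproduced with a small sketch. The three-line sample and the variable names sample, original and fixed are assumptions made for the demonstration; the loops themselves follow the question and the first fix above.

```shell
#!/usr/bin/env bash
# Hypothetical three-line sample standing in for the file from the question.
sample=$(printf '%s\n' 'Server3 has 61 PG Plan1' 'Server3 has 29 PG Plan6' 'Server4 has 18 PG Plan4')

# As in the question: awk gets no file argument, so it reads the loop's stdin.
# read consumes line 1, awk consumes lines 2 and 3, and the unquoted $(...)
# lets echo join the two output lines with a space.
original=$(printf '%s\n' "$sample" | while read line; do echo $(awk '{print $4, $5}'); done)

# Feeding "$line" to awk yields one output line per input line.
fixed=$(printf '%s\n' "$sample" | while read -r line; do printf '%s\n' "$line" | awk '{print $4, $5}'; done)

printf 'original: %s\n' "$original"
printf 'fixed:\n%s\n' "$fixed"
```

The original form prints a single line that starts at the second record, which is why PG Plan1 is missing from the output in the question. The fixed form prints one pair per line.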

huangapple
  • Posted on June 18, 2023 at 21:13:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76500719.html