Interfacing bash with awk

Question

I am a beginner in awk and was tweaking some bash scripts at work that interface with the awk utility to parse the output of some system files. I made my changes based on my understanding, but then better sense prevailed and I decided to write a simple test program to try out the same logical pattern. To my disappointment, the test program didn't work as expected. Below I reproduce the bash script and the awk script that I wrote, along with the error I am getting. I would appreciate your thoughts.

bash script
-----------

#!/bin/bash

string=$'a\nb\nc\nd\ne'
echo "$string"

awk -v input=${string} -f post.awk < file_input.txt > file_output.txt


awk script
----------

#!/bin/gawk

BEGIN {
  
  getline tmp
  print tmp > $3

}

END{

 print $1 > $3

}

In the awk script, I am trying to read the single line "Only one line" inside file_input.txt, and write it to file_output.txt inside the BEGIN block. Inside the END block, I am trying to write the string passed on the command line to the awk script, to file_output.txt.

I get the following output upon running the bash script:

a

b

c

d

awk: cmd. line:1: fatal: cannot open file 'c' for reading (No such file or directory)

Judging by the error, I clearly haven't understood how awk processes command-line arguments. I was assuming that in the line

awk -v input=${string} -f post.awk < file_input.txt > file_output.txt

$1 would be input
$2 would be file_input.txt
$3 would be file_output.txt

Can someone point out where I have misplaced my assumptions?

TIA

Answer 1

Score: 2

There are several problems with your code:

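  1. You forgot to quote ${string}, which leads to the confusing behaviour that awk wants to read a file named c. Because of the missing quotes, the shell splits the value on whitespace and you actually execute:

    $ awk -v input=a b c d e -f post.awk < file_input.txt > file_output.txt
    

    Here you set the variable input=a. Option parsing stops at the first non-option argument, so b is taken as the program text (which is why the error message mentions "cmd. line:1") and c and everything after it are treated as input files. post.awk is never loaded as the program. The file file_input.txt has no effect on the awk script, because awk only reads standard input when no file arguments are given, unless you actively process /dev/stdin in post.awk.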

  2. You assume that $n in the awk code represents the nth argument passed to awk. This is not true. In awk, $n represents the nth field of the current input record.

  3. In the BEGIN block, no input record $0 is defined. Fields ($i, i>0) are only defined once an input record is defined. The END block, on the other hand, still knows the last input record read from the input file.
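Points 1 and 2 can be checked with a short sketch. It is only an illustration: count_args is a hypothetical helper that reports how many arguments the shell passes on, and the string is built with printf so that the snippet also runs under plain sh (it holds the same value as string=$'a\nb\nc\nd\ne' in bash).

```shell
#!/bin/sh
# Sketch: how an unquoted expansion is split into words, and what $n means in awk.
# count_args is a hypothetical helper, not part of the original scripts.

string=$(printf 'a\nb\nc\nd\ne')   # same value as string=$'a\nb\nc\nd\ne' in bash

# Print the number of arguments received.
count_args() { echo "$#"; }

# Unquoted: the shell splits on whitespace, giving -v input=a b c d e
unquoted=$(count_args -v input=${string})

# Quoted: the whole multi-line value stays attached to input=
quoted=$(count_args -v input="${string}")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"

# In awk, $2 is the second field of the current input record,
# not the second command-line argument.
second_field=$(echo 'alpha beta gamma' | awk '{ print $2 }')
echo "second field: $second_field"
```

With the default IFS, the unquoted call passes six arguments and the quoted call passes two, and awk prints beta for $2.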

You can define an input record in the BEGIN block by using plain getline, but not getline var, as the latter does not set $0.
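The difference between the two getline forms can be seen in a small sketch. The scratch file demo_input.txt and the variable names plain and with_var are assumptions made for this illustration only.

```shell
#!/bin/sh
# Sketch: plain getline sets $0 (and the fields); "getline var" only sets var.
# demo_input.txt is a hypothetical scratch file, not part of the original post.

tmpdir=$(mktemp -d)
printf 'Only one line\n' > "$tmpdir/demo_input.txt"

# Plain getline: the record is stored in $0, so $1 is its first field.
plain=$(awk 'BEGIN { getline; print $1 }' "$tmpdir/demo_input.txt")

# getline tmp: the line goes into tmp, and $0 stays empty.
with_var=$(awk 'BEGIN { getline tmp; print "[" $0 "]", "[" tmp "]" }' "$tmpdir/demo_input.txt")

echo "plain getline: $plain"
echo "getline tmp:   $with_var"

rm -rf "$tmpdir"
```

The first command prints Only, while the second shows an empty $0 next to the full line stored in tmp.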

So, how can we make this work?

If you rely on plain bash, it works the way you want. That is to say, let bash define, through redirections, what /dev/stdin and /dev/stdout refer to for the command being run. Example:

$ binary < f1 > f2

Here, the executable binary is executed with /dev/stdin pointing to f1 and /dev/stdout pointing to f2.
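As a rough sketch of that redirection, with awk standing in for binary: the file names f1 and f2 follow the example above, while the scratch directory and the uppercase conversion are assumptions chosen only to make the effect visible.

```shell
#!/bin/sh
# Sketch: the shell connects stdin and stdout before the command starts.
# f1 and f2 follow the names in the text; tmpdir is a scratch directory.

tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/f1"

# awk gets no file argument, so it reads standard input (f1),
# and print writes to standard output (f2).
awk '{ print toupper($0) }' < "$tmpdir/f1" > "$tmpdir/f2"

result=$(cat "$tmpdir/f2")
echo "$result"

rm -rf "$tmpdir"
```

The awk program itself never names either file; the shell decides where input comes from and where output goes.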

So you can do the same, and write your awk program so that it simply assumes the default /dev/stdin and /dev/stdout.

# post.awk
BEGIN { getline tmp; print tmp }
END   { print input }

and executing this as:

$ awk -v input="${string}" -f post.awk file_input.txt > file_output.txt

should do the trick.
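To try the whole thing end to end, something like the following sketch may help. It recreates file_input.txt with the single line from the question inside a scratch directory, and it assumes that your awk accepts a multi-line value for -v. The string is built with printf so that the sketch also runs under plain sh.

```shell
#!/bin/sh
# End-to-end sketch of the corrected scripts, run in a scratch directory.
# File names follow the question; tmpdir is an assumption for the demo.

tmpdir=$(mktemp -d)
printf 'Only one line\n' > "$tmpdir/file_input.txt"
string=$(printf 'a\nb\nc\nd\ne')   # same value as string=$'a\nb\nc\nd\ne' in bash

# post.awk: print the first input line in BEGIN, the passed string in END.
cat > "$tmpdir/post.awk" <<'EOF'
BEGIN { getline tmp; print tmp }
END   { print input }
EOF

# Quoting ${string} keeps the whole value attached to input=
awk -v input="${string}" -f "$tmpdir/post.awk" "$tmpdir/file_input.txt" > "$tmpdir/file_output.txt"

output=$(cat "$tmpdir/file_output.txt")
echo "$output"

rm -rf "$tmpdir"
```

file_output.txt should then contain "Only one line" followed by the five lines a to e.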

huangapple
  • Posted on July 10, 2023 at 15:04:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76651389.html