Interfacing bash with awk:
Question
I am a beginner in awk and was looking at tweaking some bash scripts at work that interface with the awk utility to parse the output of some system files. I made my changes based on my understanding, but then better sense prevailed, and I decided to first write a simple test program to try out the same logical pattern. To my disappointment, the test program didn't work as expected. Below are the bash script and the awk script I wrote, along with the error I am getting. I would appreciate your thoughts.
bash script
-----------
#!/bin/bash
string=$'a\nb\nc\nd\ne'
echo "$string"
awk -v input=${string} -f post.awk < file_input.txt > file_output.txt
awk script
----------
#!/bin/gawk
BEGIN {
getline tmp
print tmp > $3
}
END{
print $1 > $3
}
In the awk script, I am trying to read the single line "Only one line" inside file_input.txt and write it to file_output.txt inside the BEGIN block. Inside the END block, I am trying to write the string passed on the command line to the awk script to file_output.txt.
I get the following output upon running the bash script:
a
b
c
d
awk: cmd. line:1: fatal: cannot open file 'c' for reading (No such file or directory)
Judging by the error, I obviously have not understood how awk processes command-line arguments. I was assuming that in the line
awk -v input=${string} -f post.awk < file_input.txt > file_output.txt
$1 would be input
$2 would be file_input.txt
$3 would be file_output.txt
Can someone point out where I have misplaced my assumptions?
TIA
Answer 1
Score: 2
There are several things wrong with your code:
- You forgot to quote ${string}, which leads to the confusing behaviour that awk wants to read the file c. Because of the missing quotes, the shell splits the value into separate words, and you actually execute:
  $ awk -v input=a b c d e -f post.awk < file_input.txt > file_output.txt
  Here you set the variable input=a. Since no -f option comes before it, b is taken as the program text (hence "cmd. line" in the error message), and the remaining words, starting with c, are treated as input files. The file file_input.txt has no effect on the awk script unless you actively process /dev/stdin in post.awk.
- You mistakenly believe that $n in the awk code represents the n-th argument passed to awk. This is not true. In awk, $n represents the n-th field of the current input record.
- In the BEGIN block, there is no input record $0 defined. Fields ($i, i>0) are defined only once an input record is defined. The END block, on the other hand, has knowledge of the last input record read from the input file.
  You can define an input record in the BEGIN block by using getline as is, but not getline var, as the latter does not define $0.
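The three points above can be checked with a short experiment. This is only a minimal sketch, assuming bash and an awk on the PATH; the file name demo_input.txt and the helper count_args are made up for illustration and are not part of the original scripts.

```shell
#!/bin/bash
# Illustrative sketch; demo_input.txt and count_args are hypothetical names.
string=$'a\nb\nc\nd\ne'

# count_args prints how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args -v input=${string})    # word splitting: -v input=a b c d e
quoted=$(count_args -v input="${string}")    # a single argument after -v
echo "unquoted: $unquoted, quoted: $quoted"

# Command-line operands live in ARGV, not in $1, $2, ...
argv=$(awk 'BEGIN { print ARGV[1], ARGV[2] }' first second)
echo "ARGV: $argv"

# $n is the n-th field of the current input record.
printf 'Only one line\n' > demo_input.txt
field=$(awk '{ print $2 }' demo_input.txt)
echo "second field: $field"

# In BEGIN, plain getline sets $0 and the fields; "getline var" sets only var.
plain=$(awk 'BEGIN { getline; print $3 }' demo_input.txt)
intovar=$(awk 'BEGIN { getline tmp; print "[" $0 "]", tmp }' demo_input.txt)
echo "plain getline: $plain"
echo "getline var:   $intovar"
```

The unquoted call receives six arguments and the quoted one only two, which is the difference between awk seeing stray program text and file names and awk seeing one variable assignment. The last line shows $0 still empty after getline tmp, while tmp holds the line that was read.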
So, how can we make this work?
If you rely on plain bash, it will work the way you want. That is to say, let bash define, through redirections, what /dev/stdin and /dev/stdout are for the command being run. Example:
$ binary < f1 > f2
Here, the executable binary is executed with /dev/stdin pointing to f1 and /dev/stdout pointing to f2.
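As a small illustration of that idea, the sketch below uses awk itself in the role of binary. It is given no file operands, so it reads whatever the shell attached to standard input and writes to standard output. The names f1 and f2 follow the example above; the uppercase transformation is just a placeholder chosen for this sketch.

```shell
#!/bin/bash
# Illustrative sketch: the shell, not awk, decides where input and output go.
printf 'Only one line\n' > f1

# No file operands: awk reads standard input (f1) and writes standard output (f2).
awk '{ print toupper($0) }' < f1 > f2

cat f2
```

The awk program never mentions f1 or f2, so the same program can be pointed at other files just by changing the redirections.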
So you can do the same, and write your awk program to just assume the default /dev/stdin and /dev/stdout.
# post.awk
BEGIN { getline tmp; print tmp }
END { print input }
and executing this as:
$ awk -v input="${string}" -f post.awk file_input.txt > file_output.txt
should do the trick.
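For completeness, here is a hedged end-to-end run of the corrected scripts. It recreates file_input.txt with the single line described in the question and writes post.awk out from a here-document so that the sketch is self-contained. It assumes gawk, as in the question's shebang line; some other awk implementations may reject a literal newline inside a -v value.

```shell
#!/bin/bash
# Illustrative end-to-end run; file names follow the question.
string=$'a\nb\nc\nd\ne'

printf 'Only one line\n' > file_input.txt

# Same program as post.awk above, written out here so the sketch is self-contained.
cat > post.awk <<'EOF'
BEGIN { getline tmp; print tmp }
END   { print input }
EOF

awk -v input="${string}" -f post.awk file_input.txt > file_output.txt

cat file_output.txt
```

With this setup, file_output.txt should contain the line from file_input.txt followed by the five lines stored in string.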