Using awk to calculate a csv but with uneven columns
Question
I need to write a script using the .awk extension to do calculations within a file. Each row contains numbers and a keyword in the last spot dictating whether to add or subtract the numbers within a row. However, since the keywords are not in the same column, I cannot figure out how to parse through this correctly. The table looks like:
1,2,5,7,14,11,51,ADD
1,3,5,SUB
15,13,11,19,ADD
19,13,12,22,SUB
1,5,8,2,0,13,18,22,6,4,7,ADD
11,3,SUB
Any help would be appreciated.
Answer 1
Score: 3
You say you need to write an awk script with the awk extension, which I take to mean yourscript.awk. awk is fairly simple, as long as you understand Records (lines), Fields (columns) and Rules (the awk commands applied to each line of input, in the order you write them).
A Rule is contained between the outer { ... } within the awk script. There are two special rules: BEGIN (runs before the 1st record is processed; used for setup, header output, etc.) and END (runs after the last record is processed; used for things like outputting total sums of all record values, averages, or any other processing after the last record).
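To see when each kind of rule fires, here is a minimal sketch. It is only an illustration: the function name rules_demo, the three input lines and the messages are made up and are not part of your task.

```shell
# Hypothetical demo: BEGIN runs once before any input, the pattern-less
# rule runs once per record, and END runs once after the last record.
rules_demo() {
    printf 'a\nb\nc\n' |
    awk '
        BEGIN { print "start" }                     # before the 1st record
              { count++; print "record: " $0 }      # for every record
        END   { print "total records: " count }     # after the last record
    '
}
rules_demo
```

This should print "start", then one "record: ..." line per input line, then "total records: 3".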
In your case you simply need BEGIN to set FS (the Field Separator) to "," so your fields are split on ",".
After you set FS, you can use the special variable NF (number of fields) to determine the number of fields in each row (and which field holds "ADD" or "SUB"). If you want to know the current line in the file being processed, the FNR internal variable gives you that info.
With that as background, you can process your file and either add ("ADD") or subtract ("SUB") all values based on what is in $NF (the $ denotes a field value, e.g. if NF == 3, then $NF is the value of the 3rd field).
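If it helps to see FS, NF, FNR and $NF together before doing any arithmetic, here is a small sketch that prints them for two of your sample rows. The function name fields_demo and the output layout are just assumptions for illustration.

```shell
# Hypothetical demo: print the record number, the field count and the
# last field (the keyword) for two of the sample rows.
fields_demo() {
    printf '1,3,5,SUB\n15,13,11,19,ADD\n' |
    awk '
        BEGIN { FS = "," }      # split each record on commas
        { printf "FNR=%d NF=%d last=%s\n", FNR, NF, $NF }
    '
}
fields_demo
```

The first row has 4 fields and the second has 5, yet $NF picks out the keyword in both, which is why the uneven column count is not a problem.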
It is simpler to loop from the 2nd field on, either adding to or subtracting from the first field ($1), which starts the calculation for each record. Stop before the last field, since it holds the keyword. Get the total and output it, then repeat for each record in the file, e.g.
#!/bin/awk -f

BEGIN { FS = "," }                  ## initialize FS to ","

{
    n = $1                          ## initialize n to 1st field value
    for (i = 2; i < NF; i++) {      ## loop over fields 2 through NF - 1
        if ($NF == "SUB")           ## if $NF is "SUB"
            n -= $i                 ## subtract current from total
        else if ($NF == "ADD")      ## otherwise if $NF is "ADD"
            n += $i                 ## add current to total
    }
    printf "line: %d => % 3d\n", FNR, n     ## output total
}
Example Use/Output
With your sample data in the file dat/keyword.txt and the script named keyword.awk (made executable with chmod +x keyword.awk), you would run:
$ ./keyword.awk dat/keyword.txt
line: 1 =>  91
line: 2 =>  -7
line: 3 =>  58
line: 4 => -28
line: 5 =>  86
line: 6 =>   8
NOTE: if your ADD or SUB calculations need to be done differently, you can simply adjust the summing logic above as needed.
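As one example of such an adjustment, here is a sketch that decides the sign once per row and reports rows whose keyword is neither ADD nor SUB instead of just printing $1. This is an assumption about what you might want, not something your question asks for; the function name keyword_strict, the "unknown keyword" message and the MUL test row are all made up.

```shell
# Hypothetical variant: choose the sign once per record, and report
# records whose last field is neither ADD nor SUB.
keyword_strict() {
    awk '
        BEGIN { FS = "," }
        {
            if      ($NF == "ADD") sign =  1
            else if ($NF == "SUB") sign = -1
            else {
                printf "line: %d => unknown keyword \"%s\"\n", FNR, $NF
                next                    # skip to the next record
            }
            n = $1                      # start from the 1st field
            for (i = 2; i < NF; i++)    # fields 2 through NF - 1
                n += sign * $i
            printf "line: %d => %d\n", FNR, n
        }
    ' "$@"
}
printf '11,3,SUB\n1,2,MUL\n1,2,5,ADD\n' | keyword_strict
```

Because the function passes its arguments through to awk, keyword_strict dat/keyword.txt would read your data file instead of standard input.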
Let me know if you have questions.