Why does the AND (&&) operator in awk end up changing the values from my input file?

Question

Quick question: I am trying to check for certain patterns in my comma-delimited CSV data, and I need the AND operator (&&). I chose to work with awk.

Commands tried:

1:

cat file | awk -F "," 'BEGIN{ OFS=FS=","}
                       {if (($1="DIFF" && $2="DIFF" && $3="NODIFF"))
                          $(NF+1)="TRUE";
                        else $(NF+1)="FALSE";
                          print
                       }'  > outfile

2:

cat file | awk -F "," 'BEGIN{ OFS=FS=","}
                       {if (($1="DIFF" && $2="DIFF" && $3="NODIFF"))
                          $(NF+1)="TRUE";
                        else if (!($1="DIFF" && $2="DIFF" && $3="NODIFF"))
                           $(NF+1)="FALSE";print
                        }' > outfile

Input:

DIFF,DIFF,NODIFF
DIFF,NODIFF,DIFF
DIFF,DIFF,DIFF
NODIFF,DIFF,DIFF

Desired output:

DIFF,DIFF,NODIFF,TRUE
DIFF,NODIFF,DIFF,FALSE
DIFF,DIFF,DIFF,FALSE
NODIFF,DIFF,DIFF,FALSE

However, this is the output that I get, and I don't know how to stop awk from doing this:

1,1,NODIFF,TRUE
1,1,NODIFF,TRUE
1,1,NODIFF,TRUE
1,1,NODIFF,TRUE

I thought this was a simple way to do it, but apparently I am missing something.

Thanks for any ideas or suggestions.

Answer 1

Score: 2

$1="DIFF" is an assignment operation.

The comparison operator is ==; try $1=="DIFF" && $2=="DIFF" && $3=="NODIFF"
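A minimal sketch of the first command with == in place of =, assuming the four sample lines from the question as input. The shell variable names `input` and `result` are only for illustration and are not part of the original commands.

```shell
# Sample input from the question, held in a variable so the example is self-contained.
input='DIFF,DIFF,NODIFF
DIFF,NODIFF,DIFF
DIFF,DIFF,DIFF
NODIFF,DIFF,DIFF'

# == compares a field with a string; = would assign to the field instead.
result=$(printf '%s\n' "$input" |
awk 'BEGIN { FS = OFS = "," }
     {
         if ($1 == "DIFF" && $2 == "DIFF" && $3 == "NODIFF")
             $(NF+1) = "TRUE"
         else
             $(NF+1) = "FALSE"
         print
     }')

printf '%s\n' "$result"
```

With a real file, the same awk program can read it directly (awk '...' file > outfile) without cat.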

As for how you end up with a pair of 1's ...

  • && binds more tightly than =, and = groups right to left, so awk reads the condition as $1=("DIFF" && ($2=("DIFF" && ($3="NODIFF"))))
  • the innermost $3="NODIFF" is a plain assignment; its value is the string NODIFF, and any non-empty string counts as 'true'
  • "DIFF" && (something true) evaluates to 1, so $2=("DIFF" && ($3="NODIFF")) becomes $2=1
  • the same happens one level out: $1=("DIFF" && 1) becomes $1=1
  • the whole condition therefore evaluates to 1 (true), which is why every line also gets TRUE appended
  • net result: all output lines start with 1,1,NODIFF
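The grouping described above can be checked by writing the parentheses out by hand. This is only a sketch: it feeds one line from the question's input through the explicitly grouped condition, and the variable name `out` is an assumption for illustration.

```shell
# The second input line from the question; none of its fields match the intended test.
out=$(printf 'DIFF,NODIFF,DIFF\n' |
awk 'BEGIN { FS = OFS = "," }
     {
         # Explicit form of: $1="DIFF" && $2="DIFF" && $3="NODIFF"
         # $3 is assigned the string NODIFF, then $2 and $1 are assigned the result of &&, which is 1.
         if ($1 = ("DIFF" && ($2 = ("DIFF" && ($3 = "NODIFF")))))
             print
     }')

printf '%s\n' "$out"   # prints 1,1,NODIFF
```

The condition is true for every input line, so the fields are overwritten regardless of what the line originally contained.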

Answer 2

Score: 0

You do not need cat when you have a file; GNU AWK can read it on its own. You are setting the field separator twice, once with -F and once by assigning FS in BEGIN. Strictly speaking this is not an error, but it is redundant. You mixed up = (assignment) with == (equality test), which caused the malfunction.

In this particular case you can get the task done with a single regular expression. Let the content of file.txt be

DIFF,DIFF,NODIFF
DIFF,NODIFF,DIFF
DIFF,DIFF,DIFF
NODIFF,DIFF,DIFF

then

awk 'BEGIN{FS=OFS=","}{$(NF+1)=/^DIFF,DIFF,NODIFF/?"TRUE":"FALSE";print}' file.txt

gives output

DIFF,DIFF,NODIFF,TRUE
DIFF,NODIFF,DIFF,FALSE
DIFF,DIFF,DIFF,FALSE
NODIFF,DIFF,DIFF,FALSE

Explanation: I tell GNU AWK that , is both the field separator and the output field separator. I then set the next field to TRUE if the line starts with (^) DIFF,DIFF,NODIFF and to FALSE otherwise, using the so-called ternary operator condition ? value_if_true : value_if_false, and finally print the line.

(tested in GNU Awk 5.1.0)
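The regular expression above only anchors the start of the line, so it would also match a third field that merely begins with NODIFF. As a hedged variant that is not part of the original answer, the same ternary form can be combined with exact field comparisons. The second input line (NODIFFX) is hypothetical data added to show the difference, and `result` is an illustrative variable name.

```shell
# First line is from the question; the NODIFFX line is a made-up edge case.
result=$(printf '%s\n' 'DIFF,DIFF,NODIFF' 'DIFF,DIFF,NODIFFX' |
awk 'BEGIN { FS = OFS = "," }
     {
         # Exact comparison of each field, selected with the ternary operator.
         $(NF+1) = ($1 == "DIFF" && $2 == "DIFF" && $3 == "NODIFF") ? "TRUE" : "FALSE"
         print
     }')

printf '%s\n' "$result"
```

Here the NODIFFX line gets FALSE, whereas the prefix regex would mark it TRUE; which behavior is wanted depends on the data.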

huangapple
  • Posted on June 16, 2023 at 00:16:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76483639.html