How can I update an Awk command to work with a newer version of MacOS?

Question

I am using the following Awk command which is based on this Stack Exchange post:

tail -n +2 *.csv | sort -t',' -k2 | awk -F',' '$2~/^[[:space:]]*$/{next} {sub(/\r$/,"")} $2!=prev{close(out); out=$2".txt"; prev=$2} {print $1 > out}'

The command works perfectly under MacOS 10.14. However, I recently upgraded to MacOS 12.6 and it no longer works. (MacOS 12.6 uses awk version 20200816).

It produces the following error:

awk: newline in regular expression ... at source line 1
 context is
	$2~/^[[:space:]]*$/{next} {sub(/ >>> 
 <<< 
awk: syntax error at source line 1
awk: illegal statement at source line 1

How can I get it working again and, ideally, make it more future-proof without having to install any extra software? I looked at the changes made to awk, but can't find anything that would cause it to stop working.


Background

The command takes all CSV files in a directory. It splits the file into text files according to the values of the second column of the CSV file while only keeping the values stored in the first column.

Example CSV file:

COLUMN 1,COLUMN 2
innovation "is essential",3-Entrepreneurship
countless,
innocent,2-Police
toilet handle,2-Bathroom
née dresses,3-Companies
odorless,2-Sense of Smell
old ideas "new takes",3-Entrepreneurship
new income streams,3-Entrepreneurship
Zoë’s food store,3-Companies
many,
crime "doesn't sleep",2-Police
bath room,2-Bathroom
ring,
móvíl résumés,3-Companies
musty smell's come here,2-Sense of Smell
good publicity guru,3-Entrepreneurship
Señor,3-Companies

E.g. after split

In file 3-Entrepreneurship.txt

innovation "is essential"
old ideas "new takes"
new income streams
good publicity guru

In file 2-Bathroom.txt

toilet handle
bath room

In file 2-Police.txt

innocent
crime "doesn't sleep"

In file 2-Sense of Smell.txt

odorless
musty smell's come here

In file 3-Companies.txt

née dresses
Zoë’s food store
móvíl résumés
Señor

Answer 1

Score: 1


The solution I posted nearly 3 years ago still works:

# the files produced must not exist prior to the run
awk -F, 'FNR>1 && $2 {print $1 >> ($2 ".txt"); close($2 ".txt")}' file.csv

Produces:

$ head *.txt
==> 2-Bathroom.txt <==
toilet handle
bath room

==> 2-Police.txt <==
innocent
crime "doesn't sleep"

==> 2-Sense of Smell.txt <==
odorless
musty smell's come here

==> 3-Companies.txt <==
née dresses
Zoë’s food store
móvíl résumés
Señor

==> 3-Entrepreneurship.txt <==
innovation "is essential"
old ideas "new takes"
new income streams
good publicity guru

Or, here is a Ruby version:

ruby -r csv -e '
CSV.parse($<.read, **{:headers=>true, :liberal_parsing=>true}).
    select{|r| r["COLUMN 2"]}.
    group_by{|r| r["COLUMN 2"]}.
    each{|k,v| File.write("#{k}.txt", v.map(&:first).map(&:last).join("\n"))
}
' file.csv
# same output
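
The awk one-liner above appends with >>, which is why its output files must not exist beforehand, and it tests $2 directly, so a field holding only a carriage return still counts as non-empty. As a minimal sketch, not a tested fix for MacOS 12.6, the variant below truncates each output file the first time it is seen in a run, strips a trailing carriage return using the octal escape \015, and accepts several CSV files at once. The split_csv helper, the a.csv and b.csv file names, and the sample rows are assumptions made for illustration.

```shell
# Sketch only: split_csv, a.csv and b.csv are hypothetical names.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'COLUMN 1,COLUMN 2\ninnocent,2-Police\ncountless,\ntoilet handle,2-Bathroom\n' > a.csv
printf 'COLUMN 1,COLUMN 2\r\nbath room,2-Bathroom\r\nring,\r\n' > b.csv   # CRLF endings

split_csv() {
  awk -F',' '
    FNR == 1 { next }                  # skip the header row of every input file
    { sub(/\015$/, "") }               # drop a trailing carriage return (octal 015)
    $2 ~ /^[ \t]*$/ { next }           # skip rows whose second column is blank
    {
      out = $2 ".txt"
      if (!(out in seen)) {            # first time this name appears in this run:
        printf "" > out; close(out)    # create or truncate the file
        seen[out] = 1
      }
      print $1 >> out                  # append, then close so few files stay open
      close(out)
    }
  ' "$@"
}

split_csv a.csv b.csv
split_csv a.csv b.csv    # a second run overwrites instead of duplicating lines
cat 2-Bathroom.txt
```

Because every file is closed right after each write, the input does not need to be sorted first, at the cost of reopening a file for every row.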

Answer 2

Score: 0

Looks like it's treating the \r as a literal linefeed (possible issue with using smart quotes?).

You might try, say, replacing \r with \x0d to see if that makes a difference.
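
If the \r escape really is what trips things up, another spelling worth trying is the octal form \015, which is among the escapes POSIX awk defines; hex escapes such as \x0d are, as far as I know, an extension and not guaranteed everywhere. The sketch below is the pipeline from the question with only that one substitution, run against a small made-up sample.csv with CRLF line endings. It is an assumption that this sidesteps the error on MacOS 12.6; it has not been verified there.

```shell
# Sketch only: sample.csv and its rows are made up for illustration.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'COLUMN 1,COLUMN 2\r\ninnocent,2-Police\r\ncountless,\r\ntoilet handle,2-Bathroom\r\ncrime,2-Police\r\n' > sample.csv

# Same logic as the command in the question; \r is written as \015.
tail -n +2 sample.csv | sort -t',' -k2 |
awk -F',' '
  $2 ~ /^[[:space:]]*$/ { next }       # skip rows with a blank second column
  { sub(/\015$/, "") }                 # strip a trailing carriage return
  $2 != prev { close(out); out = $2 ".txt"; prev = $2 }
  { print $1 > out }
'
```

Putting the awk program in its own file and running it with awk -f is another way to rule out quoting or escape handling by whatever launches the command.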

Posted by huangapple on May 13, 2023, 22:51:26
Original link: https://go.coder-hub.com/76243328.html
