awk how to print from column 4 to the end of line when last column contains a sentence

Question

Suppose I have a file, fileA.txt, containing lines like this:

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

I would like to print only columns 2 and 3, but column 3 contains a sentence with spaces. I would like my output to look like this:

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

But I cannot get the last column to print the full sentence. I only get the last word. For example:

awk '{print $2,$NF}' fileA.txt

outputs

2017 condition
2013 tire
2017 change 

I know why this is happening: awk treats spaces as column separators, so $NF refers only to the last field of each line, which is the last word.
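To make the field splitting described above concrete, here is a minimal sketch using one of the sample lines. The variable names `line` and `out` are only for illustration.

```shell
# One sample line from fileA.txt.
line='Honda 2013 Civic Flat back right tire'

# With the default field separator, every run of blanks starts a new field.
# NF is the number of fields on the line; $NF is the last of those fields.
out=$(printf '%s\n' "$line" | awk '{ print NF ": " $NF }')

printf '%s\n' "$out"
```

The sentence is split into separate fields, so $NF can only ever be its final word.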

How do I get the desired output using awk?

Answer 1

Score: 2

I would use GNU AWK in the following way. Let the content of file.txt be

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

then

awk '{$1="";print substr($0,2)}' file.txt

gives output

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

Explanation: I clear the 1st field ($1) by setting its value to the empty string. This makes awk rebuild $0 with a leading space, so I print the rest starting at the 2nd character using the substr function to remove that space.

(tested in GNU Awk 5.1.0)
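As a rough illustration of why substr is needed, the sketch below prints the rebuilt record inside square brackets, before and after trimming. The brackets and the variable names are only there for the demonstration.

```shell
line='Toyota 2017 Corola Good condition'

# Assigning to $1 makes awk rebuild $0 by joining the fields with OFS
# (a single space by default), so the record now starts with a space.
rebuilt=$(printf '%s\n' "$line" | awk '{ $1 = ""; print "[" $0 "]" }')

# substr($0, 2) drops that first character.
trimmed=$(printf '%s\n' "$line" | awk '{ $1 = ""; print "[" substr($0, 2) "]" }')

printf '%s\n%s\n' "$rebuilt" "$trimmed"
```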

Answer 2

Score: 2

The only downside to assigning an empty string to field 1 is that it causes the record to be rebuilt from its fields, which removes any additional whitespace from the original record. You can preserve the original record format (which may or may not be required) by using either sub(), or match() followed by substr() with RSTART (set by match() to the position where the match begins, here the start of the 2nd field).

For example:

$ awk '{sub ($1 FS, "", $0)}1' file

(substitute the empty-string for the field-1 and a field-separator)

or

$ awk '{match ($0,$2); print substr($0,RSTART)}' file

(use match() to obtain the start of field-2 and print with substr())
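To see the whitespace difference, here is a small sketch on a hypothetical line with three spaces before "Roof" (the sample data in the question has single spaces only, so this input is an assumption made for the demonstration).

```shell
# Hypothetical input: note the three spaces between "Wrangler" and "Roof".
line='Jeep 2017 Wrangler   Roof is leaking'

# Assigning to $1 rebuilds the record, collapsing runs of blanks to one space.
collapsed=$(printf '%s\n' "$line" | awk '{ $1 = ""; print substr($0, 2) }')

# sub() edits $0 in place, so the original spacing after field 1 survives.
preserved=$(printf '%s\n' "$line" | awk '{ sub($1 FS, "", $0) } 1')

printf '%s\n%s\n' "$collapsed" "$preserved"
```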

Example Use/Output

With your sample data in file you would receive the following:

$ awk '{sub ($1 FS, "", $0)}1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

and

$ awk '{match ($0,$2); print substr($0,RSTART)}' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

See GNU Awk User's Guide - String Functions for detailed usage of sub(), match() and substr(). Note that the parameter order of match() differs from that of sub(), gsub(), etc. Let me know if you have questions.
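A quick sketch of the parameter-order difference, using a fixed regex on one sample line (the replacement value "2014" is made up for the example):

```shell
line='Honda 2013 Civic Flat back right tire'

# match(): the string comes first, then the regex.
# It sets RSTART (1-based start of the match) and RLENGTH (its length).
pos=$(printf '%s\n' "$line" | awk '{ match($0, /2013/); print RSTART, RLENGTH }')

# sub(): the regex comes first, then the replacement, then the target.
new=$(printf '%s\n' "$line" | awk '{ sub(/2013/, "2014", $0); print }')

printf '%s\n%s\n' "$pos" "$new"
```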

Answer 3

Score: 2

$ cut -d' ' -f2- file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

$ sed 's/[^ ]* //' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

$ awk '{sub(/[^ ]* /,"")} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

$ awk 'match($0," "){$0=substr($0,RSTART+1)} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

Any of the above will work no matter which values are in $1 or $2 or anywhere else in the input.
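To illustrate why the field values can matter, here is a sketch with a made-up input line whose first field happens to contain the text of the second field. Using a field value as a dynamic regex, as in match($0,$2), finds the leftmost occurrence, which may not be where the field itself starts.

```shell
# Hypothetical input: the year also appears inside the first field.
line='X2017 2017 Corola Good condition'

# match($0, $2) uses the value of $2 as a regex and finds its first
# occurrence, which here is inside field 1 rather than at field 2.
by_value=$(printf '%s\n' "$line" | awk '{ match($0, $2); print substr($0, RSTART) }')

# Removing "everything up to and including the first space" does not
# depend on what the fields contain.
by_position=$(printf '%s\n' "$line" | awk '{ sub(/[^ ]* /, "") } 1')

printf '%s\n%s\n' "$by_value" "$by_position"
```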

If you wanted to split each line into 4 fields, you could do this with GNU awk, which supports a 3rd argument to match() and the \S shorthand:

$ awk -v OFS='\t' '
    match($0,/(\S+) (\S+) (\S+) (.*)/,a) {
        print a[1], a[2], a[3], a[4]
    }
' file
Toyota  2017    Corola  Good condition
Honda   2013    Civic   Flat back right tire
Jeep    2017    Wrangler        Roof is leaking and oil needs change

and then you can, of course, print whichever fields you like.
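If GNU awk is not available, a similar split can be approximated in POSIX awk. The sketch below is an alternative technique, not the 3-argument match() shown above: it peels off the first three fields with index() and substr(). It assumes fields are separated by exactly one space and that every line has at least four fields; the names `f` and `rest` are arbitrary.

```shell
line='Toyota 2017 Corola Good condition'

out=$(printf '%s\n' "$line" | awk '{
    rest = $0
    # Peel off three leading fields; whatever remains is the sentence.
    for (i = 1; i <= 3; i++) {
        n = index(rest, " ")           # position of the next single space
        f[i] = substr(rest, 1, n - 1)  # text before that space
        rest = substr(rest, n + 1)     # text after that space
    }
    # Print the year, the model, and the untouched sentence.
    print f[2], f[3], rest
}')

printf '%s\n' "$out"
```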

huangapple
  • Published on 2023-08-05 10:26:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76839930.html