awk how to print from column 4 to the end of line when last column contains a sentence
Question
If I have a file fileA.txt containing lines like this:
Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change
I would like to print only columns 2 and 3, but column 3 contains a sentence with spaces. I would like my output to look like this:
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
But I cannot get the last column to print as a full sentence. I get only the last word, for example with:
awk '{print $2,$NF}' fileA.txt
This outputs
2017 condition
2013 tire
2017 change
I know why this is happening: awk treats spaces as column separators, so $NF refers only to the last field of each line, which is the last word.
How do I get the desired output using awk?
Answer 1
Score: 2
I would use GNU AWK in the following way. Let the content of file.txt be
Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change
then
awk '{$1="";print substr($0,2)}' file.txt
gives output
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
Explanation: I clear the 1st field ($1) by setting its value to the empty string, then print the rest starting at the 2nd character using the substr function, in order to remove the leading space.
(Tested in GNU Awk 5.1.0.)
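The same idea can be extended to start at an arbitrary field, such as the "column 4" case in the title. This is a minimal sketch, not part of the original answer: the function name from_field and the variable n (the first field to keep) are hypothetical, and it assumes the default single-space output separator. Like the one-liner above, it rebuilds the record, so runs of whitespace are collapsed.

```shell
# from_field N: print each input line starting at field N (hypothetical helper).
# Fields 1..N-1 are blanked, then the leading separators they leave are stripped.
from_field() {
  awk -v n="$1" '{ for (i = 1; i < n; i++) $i = ""; sub(/^ +/, ""); print }'
}

# Sample data from the question, starting at field 4.
out=$(printf '%s\n' \
  'Toyota 2017 Corola Good condition' \
  'Honda 2013 Civic Flat back right tire' \
  'Jeep 2017 Wrangler Roof is leaking and oil needs change' | from_field 4)

printf '%s\n' "$out"
# Good condition
# Flat back right tire
# Roof is leaking and oil needs change
```

With n set to 2 this should give the same result as the substr one-liner on the sample data.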
Answer 2
Score: 2
The only downside to assigning an empty string to field 1 is that it causes awk to rebuild the record from its fields, which removes any additional whitespace from the original record. You can preserve the original record format (which may or may not be required) by using either sub(), or match() followed by substr() with RSTART (set by match() to the position where the 2nd field starts).
For example:
$ awk '{sub ($1 FS, "", $0)}1' file
(substitute the empty string for field 1 and a field separator)
or
$ awk '{match ($0,$2); print substr($0,RSTART)}' file
(use match() to obtain the start of field 2 and print with substr())
Example Use/Output
With your sample data in file you would receive the following:
$ awk '{sub ($1 FS, "", $0)}1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
and
$ awk '{match ($0,$2); print substr($0,RSTART)}' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
See GNU Awk User's Guide - String Functions for detailed usage of sub(), match() and substr(). (Note the difference in parameter order for match() compared with sub(), gsub(), etc.) Let me know if you have questions.
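To make the whitespace point concrete, here is a small sketch comparing the two approaches. The input line is a made-up example with extra spaces inside the sentence, and the variable names rebuilt and preserved are only for illustration.

```shell
# Hypothetical input line with runs of spaces inside the sentence.
line='Jeep 2017 Wrangler Roof   is    leaking'

# Assigning to $1 makes awk rebuild $0 with single-space separators.
rebuilt=$(printf '%s\n' "$line" | awk '{ $1 = ""; print substr($0, 2) }')

# sub() edits $0 in place, so the original spacing survives.
preserved=$(printf '%s\n' "$line" | awk '{ sub($1 FS, "", $0) } 1')

printf '[%s]\n[%s]\n' "$rebuilt" "$preserved"
# [2017 Wrangler Roof is leaking]
# [2017 Wrangler Roof   is    leaking]
```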
Answer 3
Score: 2
$ cut -d' ' -f2- file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ sed 's/[^ ]* //' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk '{sub(/[^ ]* /,"")} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk 'match($0," "){$0=substr($0,RSTART+1)} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
Any of the above will work no matter which values are in $1 or $2 or anywhere else in the input.
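As an illustration of why that matters, consider a line where the text of field 2 also appears inside field 1. The input below is a made-up example, and the variable names by_value and by_separator are only for illustration. An approach that searches for the value of $2, such as match($0, $2), finds the earlier occurrence, while matching on the separator does not depend on the field contents.

```shell
# Hypothetical line whose first field contains the text of the second field.
line='Ford2017 2017 Focus Needs new brakes'

# match($0, $2) finds the first "2017", which sits inside field 1.
by_value=$(printf '%s\n' "$line" | awk '{ match($0, $2); print substr($0, RSTART) }')

# Removing everything up to the first space ignores what the fields contain.
by_separator=$(printf '%s\n' "$line" | awk '{ sub(/[^ ]* /, "") } 1')

printf '%s\n%s\n' "$by_value" "$by_separator"
# 2017 2017 Focus Needs new brakes
# 2017 Focus Needs new brakes
```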
If you wanted to separate each line into 4 fields, you could do this with GNU awk, using the 3rd arg to match() and \S:
$ awk -v OFS='\t' '
match($0,/(\S+) (\S+) (\S+) (.*)/,a) {
print a[1], a[2], a[3], a[4]
}
' file
Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change
and then you can, of course, print whichever fields you like.