2023年8月5日 10:26:03go评论91阅读模式

英文:

awk how to print from column 4 to the end of line when last column contains a sentence

问题

我想打印第2列和第3列，但第3列包含有空格的句子。我希望我的输出看起来像这样：

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

但是我无法使最后一列打印完整的句子。我只得到最后一个词，像这样：

2017 condition
2013 tire
2017 change

我知道为什么会发生这种情况。这是因为awk将空格视为列分隔符，NF仅返回每行的最后一列，即最后一个单词。

如何使用awk获取所需的输出？

英文:

If I have file fileA.txt containing lines like this:

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

I would like to print only column 2 and 3 but column 3 contains sentence with spaces. I would like my output to look like

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

But I cannot make the last column print full sentence. I get only last word like:

awk &#39;{print $2,$NF}&#39; fileA.txt

This outputs

2017 condition
2013 tire
2017 change

I know why this is happening. It is because awk treats spaces as column separators and NF returns simply only last column for each line which is last word.

How do I get desired output using awk?

答案1

得分: 2

我将按以下方式使用GNU AWK，假设file.txt的内容如下：

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

然后执行以下命令：

awk '{$1=""; print substr($0, 2)}' file.txt

将会得到以下输出：

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

解释：我通过将第一个字段（$1）的值设置为空字符串来清除它，然后使用substr函数从第2个字符开始print其余部分，以删除前导空格。

（在GNU Awk 5.1.0中测试过）

英文:

I would harness GNU AWK following way, let file.txt content be

Toyota 2017 Corola Good condition
Honda 2013 Civic Flat back right tire
Jeep 2017 Wrangler Roof is leaking and oil needs change

then

awk &#39;{$1=&quot;&quot;;print substr($0,2)}&#39; file.txt

gives output

2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

Explanation: I clear 1st field ($1) by setting its' value to empty string then print rest starting at 2nd character using substr function in order to remove leading space.

(tested in GNU Awk 5.1.0)

答案2

得分: 2

只有将空字符串分配给field-1的缺点是它会导致重新计算字段，从原始记录中删除任何额外的空格。您可以通过使用sub()或match，然后使用带有RSTART的substr()来保留原始记录格式（可能/可能不需要）。

例如：

$ awk '{sub ($1 FS, "", $0)}1' file

（用空字符串替换字段-1和字段分隔符）

或者

$ awk '{match ($0,$2); print substr($0,RSTART)}' file

（使用match()来获取字段-2的开始位置，并使用substr()打印）

示例用法/输出

对于您在file中的示例数据，您将收到以下输出：

$ awk '{sub ($1 FS, "", $0)}1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

和

$ awk '{match ($0,$2); print substr($0,RSTART)}' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

有关sub()、match()和substr()的详细用法，请参阅GNU Awk用户指南 - 字符串函数。（请注意，与sub()、gsub()等相比，match()的参数顺序有所不同）。如果您有疑问，请告诉我。

英文:

Only downside to assigning an empty-string to field-1 is that it will cause a recalculation of the fields removing any additional whitespace from the original record. You can preserve the original record format (which may/may not be required) by using either sub() or match and then substr() with RSTART (filled by `match() containing the position of the start of the 2nd field).

For example:

$ awk &#39;{sub ($1 FS, &quot;&quot;, $0)}1&#39; file

(substitute the empty-string for the field-1 and a field-separator)

$ awk &#39;{match ($0,$2); print substr($0,RSTART)}&#39; file

(use match() to obtain the start of field-2 and print with substr())

Example Use/Output

With your sample data in file you would receive the following:

$ awk &#39;{sub ($1 FS, &quot;&quot;, $0)}1&#39; file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

and

$ awk &#39;{match ($0,$2); print substr($0,RSTART)}&#39; file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

See GNU Awk User's Guide - String Functions for detailed usage of sub(), match() and substr(). (and note the difference in parameter order for match() compared with sub(), gsub(), etc..) Let me know if you have questions.

答案3

得分: 2

以下是您要翻译的代码部分：

$ cut -d' ' -f2- file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ sed 's/[^ ]* //' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk '{sub(/[^ ]* /,"")} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk 'match($0," "){$0=substr($0,RSTART+1)} 1' file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

如果您需要进一步的翻译或有其他问题，请随时提问。

英文:

$ cut -d&#39; &#39; -f2- file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ sed &#39;s/[^ ]* //&#39; file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk &#39;{sub(/[^ ]* /,&quot;&quot;)} 1&#39; file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change
$ awk &#39;match($0,&quot; &quot;){$0=substr($0,RSTART+1)} 1&#39; file
2017 Corola Good condition
2013 Civic Flat back right tire
2017 Wrangler Roof is leaking and oil needs change

Any of the above will work no matter which values are in $1 or $2 or anywhere else in the input.

If you wanted to separate each line into 4 fields you could do this with GNU awk for the 3rd arg to match() and \S:

$ awk -v OFS=&#39;\t&#39; &#39;
    match($0,/(\S+) (\S+) (\S+) (.*)/,a) {
        print a[1], a[2], a[3], a[4]
    }
&#39; file
Toyota  2017    Corola  Good condition
Honda   2013    Civic   Flat back right tire
Jeep    2017    Wrangler        Roof is leaking and oil needs change

and then you can, of course, print whichever fields you like.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

awk如何在最后一列包含句子时打印从第4列到行末的内容

问题

答案1

答案2

答案3

How do I combine 3 commands into 1 for regex and awk + want to regex on 3 possibilities e.g. 9,10,11

Remove the last 2 columns from a tsv file如何从tsv文件中删除最后2列？

寻找另一个范围内的最近较高和较低范围

如何分组并显示每组的第n行。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

发表评论