awk: how to print only one column when other columns contain more than one word

Question

Here is some sample data:

+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+

I want to print only the Plan column, but as you can see, it does not sit at a fixed whitespace-separated field number, because other columns contain more than one word. For example, for the 1.1.1.1 row, the plan name runs from field 11 to the end of the line (counting from the end would also work, since we can remove the | at the end of the line).

First, it should keep only the Plan column (the input is always table-like, as shown above). We can then exclude the first three header lines and the last border line, so that only the plan names remain.

Expected output is only plan names:

PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90

I have been searching but have not found a way so far.

Answer 1

Score: 4

You can use:

$ awk -F '|' 'NR>4 {gsub(/^[[:space:]]+|[[:space:]]+$/,"",v); print v} {v=$(NF-1)}' file
PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90
  • Set field separator to | (-F '|')
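  • Skip the first three lines and the last one: each line's value is stored in v and only printed while the next line is being processed (from NR>4 on), so the header lines and the final border are never printed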
  • Create a variable v which holds the second to last entry (v=$(NF-1)), then remove all leading and trailing whitespace from it (gsub(/^[[:space:]]+|[[:space:]]+$/,"",v))
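
The delayed-print idea may be easier to follow when the program is spread over several lines. The following is only a minimal sketch, not the answerer's exact command: it recreates the sample table in a temporary file (the names tmp and plans are made up for illustration) and trims with [ \t] instead of [[:space:]], assuming the cells are padded only with spaces or tabs.

```shell
#!/bin/sh
# Recreate the sample table from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+
EOF

plans=$(awk -F '|' '
  # From line 5 on, trim and print the value saved from the previous line.
  NR > 4 { gsub(/^[ \t]+|[ \t]+$/, "", v); print v }
  # Save the second-to-last field of the current line for the next round.
  { v = $(NF - 1) }
' "$tmp")

printf '%s\n' "$plans"
rm -f "$tmp"
```

Because the value of the last line is saved but never printed, the closing border is dropped without having to know how many lines the file has.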

Answer 2

Score: 1

If you're interested in a sed alternative:

sed '
  1,3d                             # delete the first 3 lines
  $d                               # delete the last line
  s/[[:blank:]]*|[[:blank:]]*$//   # remove the trailing border
  s/.*|[[:blank:]]*//              # consume all the leading cells.
' file

With sed's basic regular expressions, | is a plain character.
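
To illustrate that remark, here is a small sketch that applies the same two substitutions to a single row, once with basic regular expressions and once with extended ones. It assumes a sed that accepts -E (GNU and BSD sed do); the variable names row, bre and ere are made up for the example.

```shell
#!/bin/sh
row='| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |'

# Basic regular expressions (the default): | is an ordinary character.
bre=$(printf '%s\n' "$row" |
  sed 's/[[:blank:]]*|[[:blank:]]*$//; s/.*|[[:blank:]]*//')

# Extended regular expressions (-E): | means alternation, so it is escaped.
ere=$(printf '%s\n' "$row" |
  sed -E 's/[[:blank:]]*\|[[:blank:]]*$//; s/.*\|[[:blank:]]*//')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should leave just the plan name for this row; only the escaping of the pipe differs.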

Answer 3

Score: 1

Using sed, simply:

  • First remove the last pipe, together with any spaces before it:  *|[^|]*$
  • Then remove everything up to the last remaining pipe + space: .*| 
  • Operate from line 3 to the end (the border lines contain no pipe, so nothing is printed for them)

So the command is just:

sed -ne '3,${s/ *|[^|]*$//;s/^.*| //p;}' file

Or

sed -ne '
    3,${
        s/ *|[^|]*$//;
        s/^.*| //p;
    }
' file

But if the field has to begin with Plan, things become simpler:

sed -ne 's/.*\(Plan.*[^[:space:]]\)[[:space:]]*|[^|]*$/\1/p' file
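
As a quick check of the Plan-anchored variant, the sketch below feeds the sample table to sed through a here-document and writes the captured text back with \1. It assumes every plan value starts with the literal word Plan, as in the sample; the variable name plans is made up for the example.

```shell
#!/bin/sh
# Capture from "Plan" up to the last non-space character before the final
# pipe, and replace the whole line with that captured text (\1).
plans=$(sed -n 's/.*\(Plan.*[^[:space:]]\)[[:space:]]*|[^|]*$/\1/p' <<'EOF'
+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+
EOF
)

printf '%s\n' "$plans"
```

The header row is not printed even without a line range, because the pattern requires at least one non-space character after Plan inside the same cell.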

Answer 4

Score: 1

Here is a Ruby script that lets you select a column by its name:

ruby -e '
h = Hash.new()
desired = "Plan"
$<.read.split(/^\+[-+]+\+$\R/).
    select{|l| l[/\S+/]}.
    map{|l| l.split(/\n/)}.flatten.
    map{|sl| sl.split(/\s*\|\s*/)[1..]}.transpose.
    each{|a| h[a[0]] = a[1..]}
puts h[desired].join("\n")
' file

Or, this awk:

awk -v d="Plan" '
BEGIN{FS="[[:blank:]]*[|][[:blank:]]*"; idx=0}
FNR==2{for(i=1; i<=NF; i++) if (d==$i) idx=i; next}
# only data rows start with a pipe; border lines would print as empty lines
FNR>2 && idx && /^[|]/ {print $idx}
' file

Either prints:

PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90

You can print any other column by changing desired="Plan" (Ruby) or d="Plan" (awk) to the name shown at the top of that column.
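
For instance, selecting the Status column by name might look like the sketch below. It is an illustration under a few assumptions: the header is on line 2, data rows start with a pipe, and cells are padded with spaces or tabs only (so [ \t] is used instead of [[:blank:]]). The names tmp and col are made up for the example.

```shell
#!/bin/sh
# Recreate the sample table from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+
EOF

col=$(awk -v d="Status" '
  # A separator is a pipe plus any padding around it.
  BEGIN { FS = "[ \t]*[|][ \t]*"; idx = 0 }
  # Line 2 is the header: remember which field holds the wanted name.
  FNR == 2 { for (i = 1; i <= NF; i++) if ($i == d) idx = i; next }
  # Print that field for data rows only; border lines start with "+".
  FNR > 2 && idx && /^[|]/ { print $idx }
' "$tmp")

printf '%s\n' "$col"
rm -f "$tmp"
```

If the requested name does not appear in the header, idx stays 0 and nothing is printed, rather than whole lines being echoed.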

Answer 5

Score: 1

I think this is very easy; check this out:

awk -F "|" 'NR>3 && NR<8 {print $7}' file.txt

Or another nice solution, combining sed and datamash:

sed '1,3d;$d' file.txt | datamash -t '|' cut 7

In sed:

  • '1,3d;$d': delete lines 1 to 3 (1,3d) and the last line ($d) of the file.

In datamash:

  • -t "|": set the delimiter to "|" (pipe symbol).
  • cut 7: select and display only the 7th column (the Plan column).

Output (each value keeps the padding spaces of its table cell):

PlanA BGP40 
PlanB BGP30 
PlanA BGP10 
Plan BGP90
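
The awk one-liner above hard-codes the line range (NR<8) for this eight-line sample and leaves the padding spaces in place. One possible variation, shown here only as a sketch, skips the border lines by their field count and trims the cell before printing. It assumes the cells are padded with plain spaces; the variable name plans is made up for the example.

```shell
#!/bin/sh
# With -F "|", border lines contain no separator and therefore have NF == 1,
# so NF > 1 keeps only table rows; NR > 3 drops the header lines.
plans=$(awk -F '|' 'NR > 3 && NF > 1 { gsub(/^ +| +$/, "", $7); print $7 }' <<'EOF'
+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+
EOF
)

printf '%s\n' "$plans"
```

This still relies on Plan being the 7th pipe-separated field, so it shares that limitation with the original one-liner.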

huangapple
  • Posted on July 31, 2023 at 18:21:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76802674.html