awk: how to print only one column when other columns contain more than one word
Question
This is some sample data:
+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+
I want to print only the Plan column, but as you can see, it's not at a fixed column number. For example, for 1.1.1.1 the plan name runs from field 11 to the end of the line (we can remove the | at the end of the line).
First, it should extract only the Plan column (the input is table-like, as shown above), and we can exclude the first three header lines and the last line, so that only the plan names remain.
Expected output is only plan names:
PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90
I've been searching but haven't found a way so far.
Answer 1
Score: 4
You can use:
$ awk -F '|' 'NR>4 {gsub(/^[[:space:]]+|[[:space:]]+$/,"",v); print v} {v=$(NF-1)}' file
PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90
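- Set the field separator to | (-F '|').
- Skip the first three lines and the last one: each line's value is stored in v and printed only while the next line is being processed, so with NR>4 the output starts with the value from line 4 and the last line is never printed.
- The variable v holds the second-to-last field (v=$(NF-1)); all leading and trailing whitespace is then removed from it (gsub(/^[[:space:]]+|[[:space:]]+$/,"",v)).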
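The delayed-print trick from the second bullet can be seen in isolation in the sketch below. The helper name skip_head_tail and the sample input are made up for illustration; only the awk pattern is taken from the answer.

```shell
#!/bin/sh
# skip_head_tail: hypothetical helper that prints stdin minus its
# first three lines and its last line.
skip_head_tail() {
    # prev always holds the previous line. Printing starts once NR > 4,
    # so the first value printed is line 4, and the last line is stored
    # in prev but never printed.
    awk 'NR>4 {print prev} {prev=$0}'
}

# Three header lines, three data lines, one trailing line.
printf '%s\n' h1 h2 h3 a b c tail | skip_head_tail
```

Run on the seven sample lines, this should leave only a, b and c. The same shape works for any "drop the last line" task where the total line count is not known in advance.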
Answer 2
Score: 1
If you're interested in a sed alternative:
sed '
1,3d # delete the first 3 lines
$d # delete the last line
s/[[:blank:]]*|[[:blank:]]*$// # remove the trailing border
s/.*|[[:blank:]]*// # consume all the leading cells.
' file
With sed's basic regular expressions, | is a plain character.
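To illustrate that last remark, here is a small sketch that applies the two substitutions to a single row, once with basic regular expressions and once with extended ones (sed -E), where the pipe has to be escaped. The helper names last_cell_bre and last_cell_ere are assumptions made for this example.

```shell
#!/bin/sh
# last_cell_bre / last_cell_ere are hypothetical helper names for this sketch.

# BRE (sed default): an unescaped | matches a literal pipe.
last_cell_bre() {
    sed 's/[[:blank:]]*|[[:blank:]]*$//; s/.*|[[:blank:]]*//'
}

# ERE (sed -E): | means alternation, so the literal pipe is written \|.
last_cell_ere() {
    sed -E 's/[[:blank:]]*\|[[:blank:]]*$//; s/.*\|[[:blank:]]*//'
}

row='| 60hn | no problem | ACTIVE | IP=4.4.4.4 | Centos7 | Plan BGP90 |'
printf '%s\n' "$row" | last_cell_bre
printf '%s\n' "$row" | last_cell_ere
```

Both calls should print Plan BGP90. Leaving the pipe unescaped in the -E version would turn each pattern into an alternation and give a different result.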
Answer 3
Score: 1
Using sed, simply:
- First, remove the last space + pipe: " |"
- Then remove everything up to the last pipe + space: ".*| "
- Operate from line 3 to the end
So the command is just:
sed -ne '3,${s/ |[^|]*$//;s/^.*| //p;}'
Or
sed -ne '
3,${
s/ |[^|]*$//;
s/^.*| //p;
}
'
But if the field has to begin with Plan, things become simpler:
sed -ne 's/.*\(Plan.*[^[:space:]]\)[[:space:]]*|[^|]*$/\1/p'
Answer 4
Score: 1
Here is a Ruby script that allows selecting a column by its name:
ruby -e '
h=Hash.new()
desired="Plan"
$<.read.split(/^\+[-+]+\+$\R/).
select{|l| l[/\S+/]}.
map{|l| l.split(/\n/)}.flatten.
map{|sl| sl.split(/\s*\|\s*/)[1..]}.transpose.
each{|a| h[a[0]]=a[1..]}
puts h[desired].join("\n")
' file
Or, this awk:
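awk -v d="Plan" '
# Split on pipes and swallow the blanks around them.
BEGIN{FS="[[:blank:]]*[|][[:blank:]]*"; idx=0}
# Line 2 is the header row: remember which field holds the wanted name.
FNR==2{for(i=1; i<=NF; i++) if (d==$i) idx=i; next}
# Print data rows only; border lines contain no pipe, so their NF is 1.
FNR>2 && NF>1 {print $idx}
' file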
Either prints:
PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90
You can print any other column by changing desired="Plan" (or d="Plan" in the awk version) to the name shown at the top of that column.
Answer 5
Score: 1
I think it's very easy; check this out (NR<8 assumes the eight-line sample above):
awk -F "|" 'NR>3 && NR<8 {print $7}' file.txt
Another nice solution combines sed and datamash:
sed '1,3d;$d' file.txt | datamash -t '|' cut 7
In sed:
- '1,3d;$d': delete lines 1 to 3 (1,3d) and the last line ($d) of the file.
In datamash:
- -t "|": set the delimiter to "|" (the pipe symbol).
- cut 7: select and display only the 7th column (the Plan column).
Output:
PlanA BGP40
PlanB BGP30
PlanA BGP10
Plan BGP90
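The awk one-liner above depends on the sample having exactly eight lines, and it prints the cell together with the blanks around it. A possible variant is sketched below, assuming that border lines never contain a pipe and that Plan is always the 7th pipe-separated field. The function name plan_column and the variable sample are introduced here only for illustration.

```shell
#!/bin/sh
# plan_column: hypothetical helper that prints the trimmed 7th
# pipe-separated field of every data row read from stdin.
plan_column() {
    # NR>3 skips the top border, header row and separator;
    # NF>2 skips border lines, which have no pipe and therefore one field.
    awk -F'|' 'NR>3 && NF>2 { v=$7; gsub(/^[ \t]+|[ \t]+$/, "", v); print v }'
}

# Sample table from the question, kept in a variable for this sketch.
sample=$(cat <<'EOF'
+-------+--------------------+-----------+------------+-----------+-------------+
| ID    | Name               | Status    | Networks   | Image     | Plan        |
+-------+--------------------+-----------+------------+-----------+-------------+
| 1wsd  | HostName A         | PAUSED    | IP=1.1.1.1 | Ubuntu20  | PlanA BGP40 |
| 4fgh  | An other hostname  | ACTIVE    | IP=2.2.2.2 | Ubuntu20  | PlanB BGP30 |
| zxd1  | final.destination  | REBOOTING | IP=3.3.3.3 | Debian11  | PlanA BGP10 |
| 60hn  | no problem         | ACTIVE    | IP=4.4.4.4 | Centos7   | Plan BGP90  |
+-------+--------------------+-----------+------------+-----------+-------------+
EOF
)

printf '%s\n' "$sample" | plan_column
```

Because the bottom border is filtered by its field count instead of its line number, the same call should keep working when rows are added to the table.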