awk: ignore the comments in a file
Question
I'm doing an awk homework assignment and have the file IN-arg:
# this is a comment
joejoejoe 10 20 30 #this is also a comment
OAK -999 10 # 10000
joeJOE 2000
oak 10
milk 1000 # 2000
I want to ignore the comments in the file IN-arg and get output like this:
joejoejoe 10 20 30
OAK -999 10
joeJOE 2000
oak 10
milk 1000
How can I do this? Below is my code:
#! /bin/awk -f
{
    for(i=1;i<=NF;i++)
    {
        if($i == "#")
            i = NF + 1
        print $i
    }
}
Answer 1
Score: 2
What about changing the field separator to # and printing $1 (the first field) when it is not empty?
Give this a try:
#!/bin/awk -f
BEGIN { FS="#" }
$1 ~ /./ { print $1 }
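One caveat with this approach: `$1 ~ /./` also matches a first field made up only of blanks, so an indented comment-only line would print as a whitespace-only line. The following is a minimal sketch of a stricter test, assuming such lines should be dropped. The function name `strip_comments` and the sample input are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep the text before the first "#" on each line.
strip_comments() {
    awk 'BEGIN { FS = "#" }
         # Print the first field only if it holds at least one
         # character that is not a space or a tab.
         $1 ~ /[^ \t]/ { print $1 }'
}

# Illustrative input: a comment, an indented comment, and two data lines.
printf '%s\n' '# full-line comment' '   # indented comment' \
    'OAK -999 10 # 10000' 'oak 10' | strip_comments
```

Only the two data lines are printed. Any blank that preceded the `#` stays at the end of the line, because the field separator is the `#` character alone.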
Answer 2
Score: 1
Using any POSIX awk:
$ awk '$1 !~ /^#/{sub(/[[:space:]]*#.*/,""); print}' 'IN-arg'
joejoejoe 10 20 30
OAK -999 10
joeJOE 2000
oak 10
milk 1000
I'm assuming above that you don't want to leave trailing white space when comments are removed from the end of lines. If you do want to keep it, remove [[:space:]]* from the command.
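To make the difference between the two variants visible, here is a small sketch that runs both on one line of the sample file. The variable names are illustrative, and the square brackets in the output are there only to show where each result ends.

```shell
#!/bin/sh
line='OAK -999 10 # 10000'

# Variant from the answer: blanks before the "#" are removed too.
trimmed=$(printf '%s\n' "$line" | awk '{ sub(/[[:space:]]*#.*/, ""); print }')

# Variant without [[:space:]]*: the blank before the "#" is kept.
untrimmed=$(printf '%s\n' "$line" | awk '{ sub(/#.*/, ""); print }')

# Brackets make the trailing blank visible.
printf '[%s]\n[%s]\n' "$trimmed" "$untrimmed"
```

The first bracketed result ends right after `10`, while the second keeps one blank before the closing bracket.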
Answer 3
Score: 0
echo '
# this is a comment
joejoejoe 10 20 30 #this is also a comment
OAK -999 10 # 10000
joeJOE 2000
oak 10
milk 1000 # 2000' |
awk 'NF = $1 !~ "^[ \t-\r]*$"' FS='#.*'
joejoejoe 10 20 30
OAK -999 10
joeJOE 2000
oak 10
milk 1000
Answer 4
Score: 0
$ awk '!/#/ || !/^ *#/ && gsub(/ *#.*$/,"")' file
joejoejoe 10 20 30
OAK -999 10
joeJOE 2000
oak 10
milk 1000
Answer 5
Score: 0
If you're able to choose GNU Awk for the task, which supports a regular expression in RS (the record separator variable), a nice way to do this is:
$ echo 'line1 # comment
line2 # comment
# line3
line4 #comment
line5' | awk -v RS='(#[^\\n]*)?\\n' '{ printf("rec[%d] = %s\n", NR, $0) }'
rec[1] = line1
rec[2] = line2
rec[3] =
rec[4] = line4
rec[5] = line5
We include the end-of-line comment in the record separator, so it disappears at the low level, when GNU Awk is delimiting the input into records.
Our record separator is "optional hash comment, followed by newline", where the optional hash comment is "hash mark, followed by any number of non-newline characters".
POSIX says that the behavior is unspecified if RS is given a value that is more than one character long; POSIX doesn't support regular expressions in RS.
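Because a multi-character RS is unspecified by POSIX, a portable sketch of the same idea can strip the comment inside the action instead of in the record separator. This is an alternative technique, not the method above. The function name `strip_records` is hypothetical, and unlike the RS version it also removes the blanks before `#`. The angle brackets only mark where each record ends.

```shell
#!/bin/sh
# Hypothetical helper: remove "#" comments per record with sub().
strip_records() {
    awk '{
        sub(/[ \t]*#.*/, "")    # delete the comment and any blanks before it
        printf("rec[%d] = <%s>\n", NR, $0)
    }'
}

printf '%s\n' 'line1 # comment' '# line3' 'line5' | strip_records
```

A comment-only line still counts as a record here, just as in the RS version; it simply becomes empty.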
Answer 6
Score: 0
#! /bin/awk -f
{
    for(i=1;i<=NF;i++)
    {
        if($i == "#")
            i = NF + 1
        print $i
    }
}
This will not work as intended. Because you `print` each field separately, you get a newline after each field; you can use `printf` to avoid that. Also, since you allow comments like `#this is also a comment`, `#` does not have to be a separate field when the default field separator (one or more white-space characters) is in use. If you want each character to be a field, set the `FPAT` built-in variable (a GNU Awk extension) to `.`. You might also use `break` to end the loop rather than tinkering with the variable used in the condition. Finally, you need to ignore lines that contain only a comment; in other words, only print something for lines that do not start with optional white-space characters followed by `#`.
After applying these changes, your code would become:
#!/bin/awk -f
BEGIN{FPAT="."}
!/^[[:space:]]*#/{
    for(i=1;i<=NF;i++)
    {
        if($i == "#")
            break
        printf "%s",$i
    }
    print ""
}
*(tested in GNU Awk 5.1.0)*
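`FPAT` is specific to GNU Awk. As a hedged alternative for other awks, the same character-by-character scan can be sketched with `substr`. The function name `scan_chars` and the sample input are illustrative, and the sketch assumes this style is acceptable for the assignment.

```shell
#!/bin/sh
# Hypothetical helper: copy each line up to, but not including, the first "#".
scan_chars() {
    awk '!/^[ \t]*#/ {
        out = ""
        # Walk the line one character at a time, as the FPAT loop does.
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "#")
                break
            out = out c
        }
        print out
    }'
}

printf '%s\n' '# this is a comment' 'milk 1000 # 2000' 'oak 10' | scan_chars
```

As with the `FPAT` version, any blank that preceded the `#` is kept at the end of the printed line.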