Matching user sequence in awk
Question
I am matching lines according to the regexp pattern beg_ere. A user can also pass a comma-separated sequence ukeys to match against the values in pkeys, which are read from the matched line in the file being processed. If any element in ukeys matches an element in pkeys, display is set to 1 (display = 1).
My problem is that when elements in kaggr have leading or trailing spaces, the condition (uaggr[i] == kaggr[j]) fails.
match($0, beg_ere, paggr) {
    pkeys = paggr[4]
    nuk = split(ukeys, uaggr, ",")
    npk = split(pkeys, kaggr, ",")
    if ( nuk == 0 ) {
        display = 1
    }
    else if ( nuk > 0 && npk > 0 ) {
        umatch = 0
        for (i in uaggr) {
            for (j in kaggr) {
                if (uaggr[i] == kaggr[j]) { umatch = 1 ; break }
            }
            if (umatch == 1) { display = 1 }
        }
    }
}
Answer 1
Score: 1
The fieldsep argument to the split function can be a regular expression, so you can remove the whitespace that follows each comma while splitting:
npk = split(pkeys, kaggr, ",[[:blank:]]*")
Demo, splitting on a plain comma:
awk 'BEGIN {
ukeys = "a, b, c"
npk = split(ukeys, uaggr, ",")
for (i=1; i <= npk; i++) printf "%d\t>%s<\n", i, uaggr[i]
}'
1 >a<
2 > b<
3 > c<
But with the regular expression as the separator:
awk 'BEGIN {
ukeys = "a, b, c"
npk = split(ukeys, uaggr, ",[[:blank:]]*")
for (i=1; i <= npk; i++) printf "%d\t>%s<\n", i, uaggr[i]
}'
1 >a<
2 >b<
3 >c<
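Note that the separator ",[[:blank:]]*" only absorbs blanks that come after a comma. The following is a minimal sketch, assuming keys may also have blanks before a comma or at either end of the whole string. The wrapper name split_demo and the sample input are hypothetical, and [ \t] is used in place of [[:blank:]] only so that the sketch also runs on awk variants without POSIX character classes.

```shell
# Hypothetical helper: split a comma-separated list, dropping blanks around items.
split_demo() {
  awk -v keys="$1" '
  BEGIN {
      # Strip blanks at the very start and end of the whole string first,
      # because the separator regex cannot reach them.
      gsub(/^[ \t]+|[ \t]+$/, "", keys)
      # Absorb blanks on both sides of each comma while splitting.
      n = split(keys, arr, /[ \t]*,[ \t]*/)
      for (i = 1; i <= n; i++) printf "%d\t>%s<\n", i, arr[i]
  }'
}

split_demo "  a , b ,c  "
```

With this input each element prints without surrounding blanks (>a<, >b<, >c<), whereas the narrower separator would leave "a " and "b " with a trailing space.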
Alternatively, use gsub to create a "trim" function:
awk '
function trim(s) { gsub(/^[[:blank:]]+|[[:blank:]]+$/, "", s); return s }
BEGIN {
ukeys = "a, b, c"
npk = split(ukeys, uaggr, ",")
for (i=1; i <= npk; i++) printf "%d\t>%s<\t>%s<\n", i, uaggr[i], trim(uaggr[i])
}
'
1 >a< >a<
2 > b< >b<
3 > c< >c<
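Applied to the matching logic from the question, the trim approach could look roughly like the sketch below. It is not the asker's actual script: the wrapper name keys_match and the sample inputs are hypothetical, ukeys and pkeys are passed in directly instead of being extracted from a line with match(), and an array lookup replaces the nested loops. As an assumption for portability, [ \t] stands in for [[:blank:]].

```shell
# Hypothetical wrapper: print display=1 if any user key matches a file key.
keys_match() {
  awk -v ukeys="$1" -v pkeys="$2" '
  # Remove leading and trailing spaces/tabs from s.
  function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
  BEGIN {
      nuk = split(ukeys, uaggr, ",")
      npk = split(pkeys, kaggr, ",")
      display = (nuk == 0)    # no user keys given: always display
      # Index the trimmed file keys so each user key needs one lookup.
      for (j = 1; j <= npk; j++) seen[trim(kaggr[j])] = 1
      # Stop at the first user key that is present among the file keys.
      for (i = 1; i <= nuk && !display; i++)
          if (trim(uaggr[i]) in seen) display = 1
      print "display=" display
  }'
}

keys_match "b,d" " a , b , c "   # prints display=1
keys_match "x" "a, b"            # prints display=0
keys_match "" "a, b"             # prints display=1
```

Trimming both sides before comparing means the equality test no longer depends on how the keys were spaced in either the user input or the file.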