Matching user sequence in awk
Question
I am matching lines according to the regexp pattern beg_ere. A user can also
pass a comma-separated sequence ukeys to match against the values in pkeys,
which are read from the matched line in the file being processed. If any element
in ukeys matches an element in pkeys, display is set to 1 (display = 1).
My problem is that when elements in kaggr have leading or trailing spaces,
the condition (uaggr[i] == kaggr[j]) fails.
match($0, beg_ere, paggr) {
    pkeys = paggr[4]
    nuk = split(ukeys, uaggr, ",")
    npk = split(pkeys, kaggr, ",")
    if ( nuk == 0 ) {
        display = 1
    }
    else if ( nuk > 0 && npk > 0 ) {
        umatch = 0
        for (i in uaggr) {
            for (j in kaggr) {
                if (uaggr[i] == kaggr[j]) { umatch = 1 ; break }
            }
            if (umatch == 1) { display = 1 }
        }
    }
}
Answer 1
Score: 1
The fieldsep argument to the split function can be a regular expression, so you can remove the whitespace that follows each comma while splitting:
    npk = split(pkeys, kaggr, ",[[:blank:]]*")
Demo, splitting on a plain comma:
awk 'BEGIN {
    ukeys = "a, b, c"
    npk = split(ukeys, uaggr, ",")
    for (i=1; i <= npk; i++) printf "%d\t>%s<\n", i, uaggr[i]
}'
1 >a<
2 > b<
3 > c<
but with a regular expression as the separator:
awk 'BEGIN {
    ukeys = "a, b, c"
    npk = split(ukeys, uaggr, ",[[:blank:]]*")
    for (i=1; i <= npk; i++) printf "%d\t>%s<\n", i, uaggr[i]
}'
1 >a<
2 >b<
3 >c<
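The separator ",[[:blank:]]*" only consumes blanks that follow a comma, so a blank before a comma, or at either end of the whole string, stays attached to an element. A minimal sketch of one way to cover those cases is to trim the string first and then split on a separator that allows blanks on both sides of the comma. The sample value of pkeys is made up for illustration, and [ \t] is used here instead of [[:blank:]] on the assumption that some older awk builds lack POSIX character classes.

```shell
out=$(awk 'BEGIN {
    pkeys = "  a ,b , c  "                       # hypothetical sample input
    gsub(/^[ \t]+|[ \t]+$/, "", pkeys)           # trim both ends of the whole string
    npk = split(pkeys, kaggr, "[ \t]*,[ \t]*")   # consume blanks on both sides of each comma
    for (i = 1; i <= npk; i++) printf "%d\t>%s<\n", i, kaggr[i]
}')
printf '%s\n' "$out"
```

With this input every element comes out without surrounding blanks, so a plain == comparison against another cleaned list should behave as expected.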
Alternatively, use gsub to create a "trim" function:
awk '
    function trim(s) { gsub(/^[[:blank:]]+|[[:blank:]]+$/, "", s); return s }
    BEGIN {
        ukeys = "a, b, c"
        npk = split(ukeys, uaggr, ",")
        for (i=1; i <= npk; i++) printf "%d\t>%s<\t>%s<\n", i, uaggr[i], trim(uaggr[i])
    }
'
1 >a< >a<
2 > b< >b<
3 > c< >c<
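To tie the trim function back to the comparison in the question, here is a rough sketch of how the check could look. It is not the asker's code: the helper name any_match, the seen array, and the sample strings are hypothetical, and it swaps the nested loop for an array-membership test (the in operator) on trimmed keys. It again uses [ \t] rather than [[:blank:]] as a portability assumption.

```shell
out=$(awk '
function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

# Return 1 if any trimmed element of ukeys equals a trimmed element of pkeys.
# Names after the extra spaces in the parameter list are local variables.
function any_match(ukeys, pkeys,    uaggr, kaggr, seen, nuk, npk, i) {
    nuk = split(ukeys, uaggr, ",")
    npk = split(pkeys, kaggr, ",")
    for (i = 1; i <= npk; i++) seen[trim(kaggr[i])] = 1      # set of cleaned pkeys
    for (i = 1; i <= nuk; i++) if (trim(uaggr[i]) in seen) return 1
    return 0
}
BEGIN {
    print any_match("b,x", "a , b ,c")   # "b" is present once blanks are trimmed
    print any_match("x,y", "a , b ,c")   # no element in common
}')
printf '%s\n' "$out"
```

In the original rule, display could then be set when nuk == 0 or when a call like any_match(ukeys, pkeys) returns 1; the empty-ukeys case is left to the caller here because any_match returns 0 for it.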