sed: increment a field at each empty line or match

Question

Example of data generated by a software binary:

Label 1: "AAA"
Label 2: "BBB"
Label 3: "CCC"

Label 1: "XXX"
Label 2: "YYY"
Label 3: "ZZZ"

Each label "group" is separated from the next by an empty line, and each group starts with "Label 1". There can be n groups, so the desired output needs lab, lab2, lab3 and so on.

Current output:

lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab,Label 1,XXX,
lab,Label 2,YYY,
lab,Label 3,ZZZ,

Existing code:

labels="$(${binary} -list | sed -e '/^$/d')"
echo "$labels" | sed -e 's/: \{1,\}/,/g' -e 's/"//g' -e 's/, /,/g' -e "s|^|lab,|g" -e 's/$/,/g'

Desired output:

lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,

Answer 1

Score: 2

Since an empty line is the record separator, here is a way to do this in awk using an empty RS (paragraph mode), so that each group of labels is read as one record:

awk -v RS= '{
   gsub(/(^|\n)/, "&lab" (NR>1?NR:"") ","); gsub(/(: )?"/, ",")
} 1' file

lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,
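
To make the paragraph-mode behaviour easier to follow, here is a minimal sketch of the same idea written as an explicit loop. It assumes the input shape from the question. The function name label_groups is only illustrative and not part of the answer above, and setting the field separator to a newline is my own choice so that each line of a group becomes one field.

```shell
# Hypothetical helper: number blank-line-separated groups read from stdin.
label_groups() {
  awk -v RS= -F'\n' '{
    # NR is the group number here; the first group gets no suffix.
    prefix = "lab" (NR > 1 ? NR : "")
    for (i = 1; i <= NF; i++) {
      line = $i
      gsub(/(: )?"/, ",", line)   # turn the `: "` and the closing quote into commas
      print prefix "," line
    }
  }'
}

printf 'Label 1: "AAA"\nLabel 2: "BBB"\n\nLabel 1: "XXX"\nLabel 2: "YYY"\n' | label_groups
```

Because the record number doubles as the group counter, no extra variable is needed. The trade-off is that each whole group is held in memory as one record.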

Answer 2

Score: 1

awk is probably a better choice here:

awk -F': "|"' -v OFS=, '$1=="Label 1" {p="lab"n; n+=n?1:2} /./{print p,$1,$2,""}'

Declare the input field separator as either : " or " (-F': "|"') and the output field separator as a comma (-v OFS=,). If the first field of the current line is "Label 1", set variable p to the concatenation of "lab" and the value of n, then increment n by 1 if it is already non-zero, or by 2 otherwise. Finally, if the current line is not empty (/./), print p, the first and second fields, and an empty last field (for the trailing comma), separated by OFS.

Note: uninitialized variables (like n) evaluate as the empty string or the numeric value 0, depending on the evaluation context. In p="lab"n, n is evaluated in string-concatenation context, so the first time n is the empty string and p takes the value "lab". On later matches n has the value 2, 3, 4... and p takes the values "lab2", "lab3", "lab4"...
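
The two evaluation contexts can be seen in isolation with a small sketch. It is illustrative only: trace_counter is a made-up name, and the loop simply stands in for three successive "Label 1" lines.

```shell
# Illustrative only: trace how p and n evolve over three group starts.
trace_counter() {
  awk 'BEGIN {
    for (group = 1; group <= 3; group++) {
      p = "lab" n        # string context: an unset n concatenates as ""
      n += n ? 1 : 2     # numeric context: an unset n is 0, so the first step adds 2
      print p
    }
  }'
}

trace_counter
```

This should print lab, lab2 and lab3 on separate lines, which is the prefix sequence the answer relies on.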

Answer 3

Score: 1

Using any awk:

$ awk -v OFS=',' '
    NF { gsub(/(: )?"/,OFS); print "lab" n, $0; next }
    { n += (n ? 1 : 2) }
' file
lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,
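
One thing to be aware of: the counter in this script advances on every empty line, so two consecutive blank lines, or a blank line before the first group, would skip a number. If the real output can contain such lines, a possible variant is sketched below. The function name number_groups and the gap and seen variables are my own additions, not part of the answer.

```shell
# Hypothetical variant: advance the counter once per gap between groups.
number_groups() {
  awk -v OFS=',' '
    !NF { gap = 1; next }                 # remember that a separator was seen
    {
      if (gap && seen) n += (n ? 1 : 2)   # advance once per gap, not per blank line
      gap = 0; seen = 1
      gsub(/(: )?"/, OFS)
      print "lab" n, $0
    }
  '
}

printf '\nLabel 1: "AAA"\n\n\nLabel 1: "XXX"\n' | number_groups
```

The seen flag keeps leading blank lines from bumping the counter before any label has been printed.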

Answer 4

Score: 1

With your shown samples, please try the following awk code.

awk -v OFS="," '
!NF{
  if(initCount==""){ initCount=2 }
  else             { initCount++ }
  next
}
{
  gsub(/: "/,",")
  sub(/"$/,",")
  print "lab"initCount,$0
}
' Input_file

Answer 5

Score: 0

I will first translate your code into GNU AWK and then change it so that it works as intended. Let file.txt contain

Label 1: "AAA"
Label 2: "BBB"
Label 3: "CCC"

Label 1: "XXX"
Label 2: "YYY"
Label 3: "ZZZ"

then

awk '!/^$/{gsub(/: +/,",");gsub(/"/,"");gsub(/, /,",");gsub(/^/,"lab,");gsub(/$/,",");print}' file.txt

gives output

lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab,Label 1,XXX,
lab,Label 2,YYY,
lab,Label 3,ZZZ,

Note that I kept all substitutions global, even though ^ and $ can each match at most once per line.

We need a counter that increases whenever Label 1: appears in a line and that can be used in the replacement. This can be done in the following way

awk '/Label 1:/{cnt+=1}!/^$/{gsub(/: +/,",");gsub(/"/,"");gsub(/, /,",");gsub(/^/,"lab" (cnt>1?cnt:"") ",");gsub(/$/,",");print}' file.txt

gives output

lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,

Explanation: if the line contains Label 1:, increase cnt by 1 (an unset cnt is treated as 0 before the increase). For each non-empty line (!/^$/), perform the substitutions and print. The fourth gsub prepends lab, followed by cnt if cnt is greater than 1 or by the empty string otherwise, followed by a comma.

(tested in GNU Awk 5.1.0)
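
Applied to the pipeline from the question, the counter approach could replace both sed calls. The following is only a sketch under assumptions: list_labels is a hypothetical stand-in for `${binary} -list`, format_labels is an illustrative name, and the substitutions are condensed into a single print rather than copied from the one-liner above.

```shell
# Hypothetical stand-in for `${binary} -list`; replace with the real command.
list_labels() {
  printf 'Label 1: "AAA"\nLabel 2: "BBB"\n\nLabel 1: "XXX"\nLabel 2: "YYY"\n'
}

format_labels() {
  awk '
    /^Label 1:/ { cnt += 1 }              # a new group starts here
    !/^$/ {
      gsub(/: +/, ",")                    # separator between label and value
      gsub(/"/, "")                       # drop the quotes
      print "lab" (cnt > 1 ? cnt : "") "," $0 ","
    }
  '
}

labels="$(list_labels | format_labels)"
echo "$labels"
```

Counting on the Label 1 line rather than on the empty line means the result does not depend on how many blank lines separate the groups.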

Answer 6

Score: 0

echo 'Label 1: "AAA"
Label 2: "BBB"
Label 3: "CCC"

Label 1: "XXX"
Label 2: "YYY"
Label 3: "ZZZ"' | 

mawk 'NF ? $1 = (__)_ OFS $1 : (_+=!_)<_++' FS='(: )?"' OFS=, __='lab'

lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,

Answer 7

Score: 0

$ awk -F': *"|"' '
  /^Label 1:/{i++} 
  !/^$/{printf("lab%s,%s,%s,\n", (i==1 ? "" : i), $1, $2)}
' file
lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,

$ awk -F': *"|"' -v OFS="," '
  /^Label 1:/{i++} 
  !/^$/{$1=$1; print (i==1 ? "lab" : "lab"i), $0}
' file
lab,Label 1,AAA,
lab,Label 2,BBB,
lab,Label 3,CCC,
lab2,Label 1,XXX,
lab2,Label 2,YYY,
lab2,Label 3,ZZZ,

Answer 8

Score: 0

This might work for you (GNU sed):

sed -E ':a;$!{N;/\n$/!ba}
        y/"/,/;s/: |\n$//g;s/^/lab%,/mg;G
        :b;s/lab%(.*)\n(.*)/lab$((\2+1))\1\n\2/;tb
        s/(.*)\n.*/echo "\1"/e;s/^lab1,/lab,/mg
        x;s/.*/echo $((&+1))/e;x' file

Gather up each group of labels.

Translate each " to a comma.

Remove each ": " and the trailing empty line. Prepend lab% to each line, then append the hold space to the current batch of labels.

Replace each occurrence of the introduced % with a shell arithmetic expression that uses the value stored in the hold space.

Replace the pattern space with an echo command, so that the shell evaluates those expressions into actual label numbers.

For the first set of labels, remove the label number, i.e. turn lab1 into lab.

Increment the counter in the hold space, ready for the next batch of labels.

Print the result.
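
The counter step is probably the least obvious part of the script. As a minimal sketch, it can be tried on its own. This needs GNU sed, because the e flag of the s command is a GNU extension, and the function name bump is purely illustrative.

```shell
# GNU sed only: the `e` flag runs the pattern space as a shell command
# and replaces the pattern space with the output of that command.
bump() {
  sed 's/.*/echo $((&+1))/e'
}

echo 1 | bump     # the pattern space "1" becomes the command: echo $((1+1))
echo '' | bump    # an empty pattern space becomes: echo $((+1))
```

The second call shows why the first group ends up as lab1 before the final substitution: an empty hold space still evaluates to 1.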

Alternative:

sed -E '/\S/{=;G;s/(.*)\n(.*)/echo "s#^#lab$((\2+1)),#"/e;b}
        x;s/.*/echo "$((&+1))"/e;x;=;s/.*/d/' file |
sed 'N;s/\n//' |
sed -f - -e 's/^lab1,/lab,/;s/: //;y/"/,/' file

This generates a sed script with one command per input line, which either amends or deletes that line, and then runs the script against the same file.
