Using awk to split rows of a BED file on a comma-separated field

Question

I want to split rows on the comma delimiter in one field while keeping the rest of each row's information. I have tab-delimited files with six columns (as in the sample below) and a lot of rows...

From here:

1	13445	rs558318514	C	G,T	1_13445
1	13453	rs568927457	T	C	1_13455
1	13483	rs554760071	G	A,C	1_13483
1	13550	rs554008981	G	A	1_13550

To here:

1	13445	rs558318514	C	G	1_13445
1	13445	rs558318514	C	T	1_13445
1	13453	rs568927457	T	C	1_13453
1	13483	rs554760071	G	A	1_13483
1	13483	rs554760071	G	C	1_13483
1	13550	rs554008981	G	A	1_13550

Answer 1

Score: 1

$ awk -v OFS='\t' '{n=split($5,a,","); for(i=1; i<=n; i++){$5=a[i]; print}}' file
1	13445	rs558318514	C	G	1_13445
1	13445	rs558318514	C	T	1_13445
1	13453	rs568927457	T	C	1_13455
1	13483	rs554760071	G	A	1_13483
1	13483	rs554760071	G	C	1_13483
1	13550	rs554008981	G	A	1_13550
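One detail worth spelling out: assigning to $5 makes awk rebuild the whole record using OFS, which defaults to a single space, so the output is only tab-delimited if OFS is set to a tab. The sketch below is one possible way to wrap the same split-and-print loop so that tabs are kept on both input and output. The function name split_alt and the array name alt are made up for illustration, and it assumes the comma-separated values are always in column 5 and that the input really is tab-separated.

```shell
#!/bin/sh
# split_alt is a hypothetical helper name, not part of any tool.
# It emits one output row per comma-separated value in column 5,
# reading and writing tab-separated fields.
split_alt() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        n = split($5, alt, ",")
        if (n == 0) { print; next }   # empty column 5: pass the row through unchanged
        for (i = 1; i <= n; i++) {
            $5 = alt[i]               # assigning a field rebuilds $0 using OFS
            print
        }
    }' "$@"
}

# Two sample rows from the question, fed on standard input
printf '1\t13445\trs558318514\tC\tG,T\t1_13445\n1\t13453\trs568927457\tT\tC\t1_13455\n' | split_alt
```

Called with file names (for example split_alt file > file.split), it reads those files instead of standard input. Rows without a comma in column 5 are printed once, unchanged.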

huangapple
  • Posted on May 11, 2023, 16:53:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76225798.html