awk - how to add a "TER" line to a .pdb file

Question

This is a fragment of my .pdb file:

    ATOM 73 HG1 GLU 4 77.769 51.123 52.300 1.00 0.00 H
    ATOM 74 HG2 GLU 4 78.465 52.119 52.349 1.00 0.00 H
    ATOM 75 CD GLU 4 79.068 49.945 51.438 1.00 0.00 C
    ATOM 76 OE1 GLU 4 80.069 49.715 50.698 1.00 0.00 O
    ATOM 77 OE2 GLU 4 78.545 49.062 52.176 1.00 0.00 O
    ATOM 78 C GLU 4 81.179 52.948 53.610 1.00 0.00 C
    ATOM 79 O GLU 4 80.203 53.460 54.165 1.00 0.00 O
    ATOM 80 N GLU 5 82.590 53.305 53.698 1.00 0.00 N
    ATOM 81 HN GLU 5 83.090 53.117 52.847 1.00 0.00 H
    ATOM 82 CA GLU 5 83.454 54.267 54.627 1.00 0.00 C
    ATOM 83 HA GLU 5 83.749 55.087 53.980 1.00 0.00 H
    ATOM 84 CB GLU 5 82.258 54.565 55.220 1.00 0.00 C

I am trying to write an awk script that adds a "TER" line directly after the 4th amino acid residue (the residue number is in the 5th column, next to the three-letter amino acid code).

My script is shown below, but it doesn't work (it doesn't add the "TER" line at the required position in the .pdb file):

    awk 'NR==5 {print; print "TER"} NR!=5' my_pdb.pdb > pdb-with-ter.pdb

Could you help me?

In the end, I want to obtain a fragment like this:

    ATOM 73 HG1 GLU 4 77.769 51.123 52.300 1.00 0.00 H
    ATOM 74 HG2 GLU 4 78.465 52.119 52.349 1.00 0.00 H
    ATOM 75 CD GLU 4 79.068 49.945 51.438 1.00 0.00 C
    ATOM 76 OE1 GLU 4 80.069 49.715 50.698 1.00 0.00 O
    ATOM 77 OE2 GLU 4 78.545 49.062 52.176 1.00 0.00 O
    ATOM 78 C GLU 4 81.179 52.948 53.610 1.00 0.00 C
    ATOM 79 O GLU 4 80.203 53.460 54.165 1.00 0.00 O
    TER
    ATOM 80 N GLU 5 82.590 53.305 53.698 1.00 0.00 N
    ATOM 81 HN GLU 5 83.090 53.117 52.847 1.00 0.00 H
    ATOM 82 CA GLU 5 83.454 54.267 54.627 1.00 0.00 C
    ATOM 83 HA GLU 5 83.749 55.087 53.980 1.00 0.00 H
    ATOM 84 CB GLU 5 82.258 54.565 55.220 1.00 0.00 C

Answer 1

Score: 2

Assumptions:

  • when the value in the 5th column changes from 4 to 5, insert a new line (TER) before the first line of residue 5
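
This assumption keys on the residue number in field 5, not on the line number. The attempt in the question tests NR, which is awk's running count of input lines, so `NR==5` fires on the fifth line of the file regardless of which residue that line belongs to. In the fragment from the question, the fifth line (atom 77, OE2) is still inside residue 4. A minimal sketch that prints both values side by side, using a hypothetical three-line `sample.pdb`:

```shell
# Hypothetical sample: the last two atoms of residue 4 and the first atom of residue 5.
cat > sample.pdb <<'EOF'
ATOM 78 C GLU 4 81.179 52.948 53.610 1.00 0.00 C
ATOM 79 O GLU 4 80.203 53.460 54.165 1.00 0.00 O
ATOM 80 N GLU 5 82.590 53.305 53.698 1.00 0.00 N
EOF

# Print the line number (NR) next to the residue number (field 5).
awk '{ print "NR=" NR, "residue=" $5 }' sample.pdb
```

This prints `NR=1 residue=4`, `NR=2 residue=4` and `NR=3 residue=5`: the two numbers are unrelated, which is why a rule based on NR cannot find the residue boundary.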

One awk idea:

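    awk '
    prev==4 && $5==5 { print "TER" }   # residue number just changed from 4 to 5
    { prev = $5 }                      # remember this residue number for the next line
    1                                  # always-true pattern: print the current line
    ' my_pdb.pdb

    # or as a one-liner
    awk 'prev=="4" && $5=="5" {print "TER"} {prev=$5} 1' my_pdb.pdb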
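
The rule above hard-codes the transition from 4 to 5. One possible generalization, sketched here on the assumption that the residue number is always whitespace-separated field 5, passes the residue number in with `-v`. The variable name `res` and the file names are illustrative and not part of the original answer.

```shell
# Hypothetical sample: the last three atoms of residue 4 and the first two of residue 5.
cat > sample.pdb <<'EOF'
ATOM 77 OE2 GLU 4 78.545 49.062 52.176 1.00 0.00 O
ATOM 78 C GLU 4 81.179 52.948 53.610 1.00 0.00 C
ATOM 79 O GLU 4 80.203 53.460 54.165 1.00 0.00 O
ATOM 80 N GLU 5 82.590 53.305 53.698 1.00 0.00 N
ATOM 81 HN GLU 5 83.090 53.117 52.847 1.00 0.00 H
EOF

# res (illustrative name) is the residue number after which TER should be inserted.
awk -v res=4 '
prev == res && $5 != res { print "TER" }   # previous line was in res, this one is not
{ prev = $5 }                              # remember the residue number of this line
1                                          # print every input line unchanged
' sample.pdb > sample-with-ter.pdb
```

With this input, TER ends up on line 4 of `sample-with-ter.pdb`, between atoms 79 and 80. If the chosen residue is the last one in the file, no TER is written, because the rule only fires when a later line with a different residue number is read.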

This generates:

    ATOM 73 HG1 GLU 4 77.769 51.123 52.300 1.00 0.00 H
    ATOM 74 HG2 GLU 4 78.465 52.119 52.349 1.00 0.00 H
    ATOM 75 CD GLU 4 79.068 49.945 51.438 1.00 0.00 C
    ATOM 76 OE1 GLU 4 80.069 49.715 50.698 1.00 0.00 O
    ATOM 77 OE2 GLU 4 78.545 49.062 52.176 1.00 0.00 O
    ATOM 78 C GLU 4 81.179 52.948 53.610 1.00 0.00 C
    ATOM 79 O GLU 4 80.203 53.460 54.165 1.00 0.00 O
    TER
    ATOM 80 N GLU 5 82.590 53.305 53.698 1.00 0.00 N
    ATOM 81 HN GLU 5 83.090 53.117 52.847 1.00 0.00 H
    ATOM 82 CA GLU 5 83.454 54.267 54.627 1.00 0.00 C
    ATOM 83 HA GLU 5 83.749 55.087 53.980 1.00 0.00 H
    ATOM 84 CB GLU 5 82.258 54.565 55.220 1.00 0.00 C
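
The output above relies on the residue number being field 5. In the standard fixed-width PDB layout, ATOM and HETATM records carry the residue sequence number in columns 23-26 and an optional chain ID in column 22, so a file with chain IDs would have the residue number in field 6 instead. A hedged sketch that reads the columns directly, assuming the input follows that fixed-width layout (the sample records, the chain ID "A", the variable names and the file names are all hypothetical):

```shell
# Two hypothetical fixed-width ATOM records with chain ID "A" in column 22.
cat > fixed.pdb <<'EOF'
ATOM     79  O   GLU A   4      80.203  53.460  54.165  1.00  0.00           O
ATOM     80  N   GLU A   5      82.590  53.305  53.698  1.00  0.00           N
EOF

awk -v res=4 '
/^(ATOM|HETATM)/ {
    cur = substr($0, 23, 4) + 0        # columns 23-26, converted to a number
    if (seen && prev == res && cur != res) print "TER"
    prev = cur; seen = 1               # remember the residue of this atom record
}
1                                      # print every input line unchanged
' fixed.pdb > fixed-with-ter.pdb
```

Here whitespace splitting would put "A" in field 5, while `substr` still finds 4 and 5, so TER is written between the two records.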

huangapple
  • Published on July 10, 2023, 18:59:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76653047.html