Accessing the output of RDKit Chem.FindAllSubgraphsOfLengthN(mol, n)

Question

I am trying to use RDKit's `Chem.FindAllSubgraphsOfLengthN(mol, n)` function, but I cannot read the information it returns. The function runs, but I am unable to obtain the substructures.

Does anyone have suggestions for extracting the results from this function's output?

I expect output, as either tuples or strings, that lists all substructures containing N atoms; in my case, 4 atoms. Below is the code I ran, along with the output it produced.

    from rdkit import Chem
    from rdkit.Chem import Draw
    from rdkit.Chem.Draw import IPythonConsole
    from rdkit.Chem import AllChem
    AllChem.SetPreferCoordGen(True)
    from rdkit.Chem import rdmolops
    mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
    mol

    structures = Chem.FindAllSubgraphsOfLengthN(mol, 4)
    structures
    print(structures[1])

Output:

    <rdkit.rdBase._vectint object at 0x0000024C16BBA2E0>

    structures[0]

Output:

    <rdkit.rdBase._vectint at 0x24c16ba62e0>
Answer 1

Score: 1

You have to convert each result to a `list()`.

Also, `FindAllSubgraphsOfLengthN` returns bond indices, not atom indices, so to get four-atom substructures you have to look for three bonds.

    from rdkit import Chem
    from rdkit.Chem import rdDepictor
    rdDepictor.SetPreferCoordGen(True)
    from rdkit.Chem.Draw import IPythonConsole
    # Show bond indices in the drawing so the output can be read against it
    IPythonConsole.drawOptions.addBondIndices = True

    mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
    mol

[![enter image description here][1]][1]

    threebonds = Chem.FindAllSubgraphsOfLengthN(mol, 3)
    for n in threebonds:
        # Each entry is a vector of bond indices; list() makes it printable
        print(list(n))

Output:

    [0, 2, 5]
    [0, 2, 4]
    [0, 2, 3]
    [0, 2, 1]
    [1, 2, 5]
    [1, 2, 4]
    [1, 2, 3]
    [2, 5, 4]
    [2, 5, 3]
    [2, 4, 3]
    [3, 5, 4]

[1]: https://i.stack.imgur.com/qdXBf.png

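Since the question asks for four-atom substructures, one possible way to turn the bond indices into atom indices is to look up the two end atoms of each bond. This is only a minimal sketch and not part of the original answer; the helper name `subgraph_atoms` is made up for illustration.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')


def subgraph_atoms(mol, bond_indices):
    """Return the sorted atom indices touched by the given bond indices.

    Hypothetical helper for illustration; it is not an RDKit function.
    """
    atoms = set()
    for bond_idx in bond_indices:
        bond = mol.GetBondWithIdx(bond_idx)
        atoms.add(bond.GetBeginAtomIdx())
        atoms.add(bond.GetEndAtomIdx())
    return sorted(atoms)


# Three connected bonds; each result is a vector of bond indices
subgraphs = Chem.FindAllSubgraphsOfLengthN(mol, 3)
atom_sets = [subgraph_atoms(mol, sg) for sg in subgraphs]

for bonds, atoms in zip(subgraphs, atom_sets):
    print(list(bonds), '->', atoms)
```

In an acyclic molecule like this one, every three-bond subgraph covers exactly four atoms. In a molecule with rings that is not guaranteed (three bonds can close a three-membered ring), so it may be worth checking `len(atoms)` before relying on it.
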
Answer 2

Score: 0

A team member of mine wrote some code to access the path or subgraph information. To clarify: paths have no branching, whereas subgraphs can.

    from rdkit import Chem
    from rdkit.Chem import Draw
    from rdkit.Chem.Draw import IPythonConsole
    from rdkit.Chem import AllChem
    AllChem.SetPreferCoordGen(True)
    from rdkit.Chem import rdmolops
    import csv

    mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
    mol

    # Find all subgraphs of length 3 (three bonds) in the molecule.
    # Subgraphs may be branched.
    subgraphs = Chem.FindAllSubgraphsOfLengthN(mol, 3)
    # Paths are unbranched; swap in this line to get paths instead.
    # subgraphs = Chem.FindAllPathsOfLengthN(mol, 3)
    print(len(subgraphs))

    # Print the SMILES of each subgraph
    for subgraph in subgraphs:
        # Extract the subgraph (a sequence of bond indices) as a new molecule
        sub_mol = Chem.PathToSubmol(mol, subgraph)
        # Generate the SMILES string for the extracted fragment
        subgraph_smiles = Chem.MolToSmiles(sub_mol, kekuleSmiles=True)
        print(subgraph_smiles)

Output:

    11
    C=CCC
    C=CCO
    C=CCN
    C=C(C)S
    CCCS
    OCCS
    NCCS
    CC(C)O
    CC(C)N
    CC(N)O
    CC(N)O
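
To make the path/subgraph distinction concrete, the two functions can be run side by side on the same molecule. The following is a rough sketch rather than anything from the original answers; it assumes that both calls return bond indices (`useBonds=True` is passed explicitly to `FindAllPathsOfLengthN`), and the variable names are illustrative.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')

# Both calls return vectors of bond indices for three-bond fragments
subgraphs = Chem.FindAllSubgraphsOfLengthN(mol, 3)
paths = Chem.FindAllPathsOfLengthN(mol, 3, useBonds=True)

# Compare as sets of bond indices so the ordering within a fragment is ignored
subgraph_sets = {frozenset(sg) for sg in subgraphs}
path_sets = {frozenset(p) for p in paths}

# Whatever is a subgraph but not a path must be branched
branched = subgraph_sets - path_sets

print('subgraphs:', len(subgraph_sets))
print('paths:    ', len(path_sets))
for bonds in sorted(sorted(b) for b in branched):
    print('branched bond set:', bonds)
```

The branched bond sets are the ones that meet at a single atom, which is why they show up among the subgraphs but not among the paths.
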

huangapple
  • Posted on April 20, 2023, 00:41:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76056954.html