Accessing output of RDKIT Chem.FindAllSubgraphsOfLengthN(mol,n)

# Question

I am trying to use RDKit's `Chem.FindAllSubgraphsOfLengthN(mol, n)` function, but I cannot extract the results from its return value. The call runs without error, yet I am unable to obtain the substructures.

Does anyone have suggestions for reading the output of this function after it runs?

I expect output, as either tuples or strings, that lists all substructures containing N atoms; in my case, N is 4. Below is the code I ran, along with the output it produced.

    from rdkit import Chem
    from rdkit.Chem import Draw
    from rdkit.Chem.Draw import IPythonConsole
    from rdkit.Chem import AllChem
    AllChem.SetPreferCoordGen(True)
    from rdkit.Chem import rdmolops

    mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
    mol
    structures = Chem.FindAllSubgraphsOfLengthN(mol,4)
    structures
    print(structures[1])

Output:

    <rdkit.rdBase._vectint object at 0x0000024C16BBA2E0>

A second attempt:

    structures[0]

Output:

    <rdkit.rdBase._vectint at 0x24c16ba62e0>

# Answer 1
**Score**: 1

`FindAllSubgraphsOfLengthN` returns a sequence of `_vectint` objects, so convert each one with `list()` to see its contents.

Also, the function counts bonds, not atoms, and each result lists bond indices. A four-atom substructure of this molecule spans three bonds, so search with a length of 3.

    from rdkit import Chem
    from rdkit.Chem import rdDepictor
    rdDepictor.SetPreferCoordGen(True)
    from rdkit.Chem.Draw import IPythonConsole
    IPythonConsole.drawOptions.addBondIndices = True

    mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
    mol

[![enter image description here][1]][1]

    threebonds = Chem.FindAllSubgraphsOfLengthN(mol, 3)

    for n in threebonds:
        print(list(n))

Output:

    [0, 2, 5]
    [0, 2, 4]
    [0, 2, 3]
    [0, 2, 1]
    [1, 2, 5]
    [1, 2, 4]
    [1, 2, 3]
    [2, 5, 4]
    [2, 5, 3]
    [2, 4, 3]
    [3, 5, 4]


  [1]: https://i.stack.imgur.com/qdXBf.png
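
Since the question asks for atoms, the bond indices in the output can be mapped back to the atoms each bond joins. The following is a minimal pure-Python sketch rather than RDKit code: the `BONDS` table and the `atoms_in_subgraph` helper are hypothetical names introduced for illustration, and the table assumes bonds are numbered in the order they appear in the SMILES string. On a real `mol`, the same lookup should be possible with `mol.GetBondWithIdx(i).GetBeginAtomIdx()` and `GetEndAtomIdx()`.

```python
# Hand-written bond table for C=C(S)C(N)(O)C.
# Assumption: bond i joins these two atom indices, following SMILES order.
BONDS = {
    0: (0, 1),  # C=C
    1: (1, 2),  # C-S
    2: (1, 3),  # C-C
    3: (3, 4),  # C-N
    4: (3, 5),  # C-O
    5: (3, 6),  # C-C
}


def atoms_in_subgraph(bond_indices, bonds=BONDS):
    """Return the sorted atom indices touched by the given bond indices."""
    atoms = set()
    for idx in bond_indices:
        atoms.update(bonds[idx])
    return sorted(atoms)


# A few of the bond-index lists printed in the answer.
for subgraph in ([0, 2, 5], [0, 2, 1], [3, 5, 4]):
    print(subgraph, "->", atoms_in_subgraph(subgraph))
```

Each three-bond subgraph of this acyclic molecule touches exactly four atoms, which is why a length of 3 yields the four-atom substructures the question is after.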



# Answer 2
**Score**: 0

A team member of mine wrote code that accesses the path or subgraph information. To clarify, paths are unbranched, whereas subgraphs can be branched.

```python
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import AllChem
AllChem.SetPreferCoordGen(True)

mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
mol

# Find all subgraphs of length 3 (three bonds) in the molecule.
# Subgraphs can be branched.
subgraphs = Chem.FindAllSubgraphsOfLengthN(mol, 3)

# Paths are unbranched; uncomment to enumerate paths instead.
# subgraphs = Chem.FindAllPathsOfLengthN(mol, 3)
print(len(subgraphs))

# Print the connected SMILES for each subgraph.
for subgraph in subgraphs:
    # Get the subgraph as a new molecule object.
    sub_mol = Chem.PathToSubmol(mol, subgraph)
    # Generate the connected SMILES string for the subgraph.
    subgraph_smiles = Chem.MolToSmiles(sub_mol, kekuleSmiles=True)
    print(subgraph_smiles)
```

Output:

    11
    C=CCC
    C=CCO
    C=CCN
    C=C(C)S
    CCCS
    OCCS
    NCCS
    CC(C)O
    CC(C)N
    CC(N)O
    CC(N)O
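
The path versus subgraph distinction can be checked by hand on this small molecule. The sketch below is a pure-Python illustration, not RDKit's algorithm: the `BONDS` table, `is_connected`, and `is_path` are hypothetical names, and the bond numbering is assumed to follow the order of the SMILES string. It enumerates every connected set of three bonds and treats a set as a path when no atom is used by more than two of its bonds, a test that is only adequate for acyclic molecules like this one.

```python
from collections import Counter
from itertools import combinations

# Hand-written bond table for C=C(S)C(N)(O)C (assumed SMILES-order numbering).
BONDS = {
    0: (0, 1),  # C=C
    1: (1, 2),  # C-S
    2: (1, 3),  # C-C
    3: (3, 4),  # C-N
    4: (3, 5),  # C-O
    5: (3, 6),  # C-C
}


def is_connected(bond_indices, bonds=BONDS):
    """True if the chosen bonds form a single connected piece."""
    remaining = list(bond_indices)
    seen = set(bonds[remaining.pop()])
    progress = True
    while remaining and progress:
        progress = False
        for idx in list(remaining):
            a, b = bonds[idx]
            if a in seen or b in seen:
                seen.update((a, b))
                remaining.remove(idx)
                progress = True
    return not remaining


def is_path(bond_indices, bonds=BONDS):
    """True if no atom is shared by more than two of the chosen bonds."""
    degree = Counter(atom for idx in bond_indices for atom in bonds[idx])
    return max(degree.values()) <= 2


subgraphs = [c for c in combinations(BONDS, 3) if is_connected(c)]
paths = [c for c in subgraphs if is_path(c)]

print(len(subgraphs), "connected three-bond subgraphs")
print(len(paths), "of them are unbranched paths")
```

Under these assumptions the enumeration finds 11 connected subgraphs, matching the count printed by the answer, and 6 of them are unbranched.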

Posted by huangapple on 2023-04-20 00:41:40. Source: https://go.coder-hub.com/76056954.html