Unable to access individual alignment strings in biopython pairwise align
Question
I'm trying to access the individual strings in the alignment object produced by the pairwise aligner in Biopython, but I'm not getting anywhere. I mean the already aligned sequences, gaps included, as shown by print(alignment); I'd like to get them individually, or even slice them. The documentation says this is possible, but I keep getting errors.
from Bio import Align
aligner = Align.PairwiseAligner(mode='global', gap_score=-5)
my_target = 'CAGGTGCAGCTGGTGCAGAGCGGCGCGGAAGTGAAAAAACCGGGCAGCAGCG'
my_query = 'CAGTGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGGAGCAGCG'
aln = aligner.align(my_target, my_query)
print(aln[0])
The result is:
CAGGTGCAGCTGGTGCAGAGCGGCGCGGAAGTGAAAAAACCGGGCAGCAGCG
|||-||||||||||||||||||.|||||||||||||||||||||-|||||||
CAG-TGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGG-AGCAGCG
Now I'd like to get the 'query' sequence from the bottom line on its own.
I can access aln[0].query, but that seems to be just the bare query sequence, not the aligned version with gaps.
The documentation says the alignment object should support indexing and slicing, but this simply isn't working.
What I'm getting is:
aln.alignment[1]
File c:\Anaconda3\lib\site-packages\Bio\Align\__init__.py:1024, in PairwiseAlignment.__getitem__(self, key)
1022 raise NotImplementedError
1023 if isinstance(key, int):
-> 1024 raise NotImplementedError
1025 if isinstance(key, tuple):
1026 try:
NotImplementedError:
The doc:
I'd appreciate any help or pointers.
Cheers.
Answer 1
Score: 2
Answering my own question.
In Biopython 1.81 with Python 3.11.3, the alignment object can be indexed, and each alignment it returns can be indexed again to get the aligned strings, with gaps marking the deletions/insertions.
So, with the code from the original question, I'm doing:
aln[0][1]
to get:
CAG-TGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGG-AGCAGCG
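or, to slice the string:
aln[0][1][start:stop]
It does not work in my Python 3.9.12 environment, though: aln[0][N], with N being 0 or 1, raises NotImplementedError. The traceback in the question shows the installed Bio/Align/__init__.py itself raising that error for integer keys, so the difference probably comes from an older Biopython version in that environment rather than from Python itself.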
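Where integer indexing such as aln[0][1] raises NotImplementedError, one possible workaround is to rebuild the gapped rows from the coordinates of the aligned blocks. The sketch below is plain Python and makes some assumptions: that the alignment exposes those blocks through an `aligned` attribute as two parallel sequences of (start, end) pairs (worth checking in your installed version), and that the alignment is global with both sequences on the forward strand. The function name `gapped_strings` is made up for illustration and is not part of Biopython; the block coordinates in the usage part were derived by hand from the alignment printed in the question.

```python
def gapped_strings(target, query, aligned, gap="-"):
    """Rebuild the gapped target and query rows of a pairwise alignment.

    `aligned` is assumed to be a pair (target_blocks, query_blocks), where
    each element is a sequence of (start, end) index pairs describing the
    ungapped blocks that are aligned to each other.
    """
    target_blocks, query_blocks = aligned
    target_row, query_row = [], []
    t_pos = q_pos = 0  # first residue not yet written for each sequence

    for (t_start, t_end), (q_start, q_end) in zip(target_blocks, query_blocks):
        t_start, t_end = int(t_start), int(t_end)
        q_start, q_end = int(q_start), int(q_end)

        # Target residues skipped before this block face a gap in the query.
        target_row.append(target[t_pos:t_start])
        query_row.append(gap * (t_start - t_pos))

        # Query residues skipped before this block face a gap in the target.
        target_row.append(gap * (q_start - q_pos))
        query_row.append(query[q_pos:q_start])

        # The block itself is aligned residue to residue.
        target_row.append(target[t_start:t_end])
        query_row.append(query[q_start:q_end])
        t_pos, q_pos = t_end, q_end

    # Anything left after the last block is unaligned tail.
    target_row.append(target[t_pos:])
    query_row.append(gap * (len(target) - t_pos))
    target_row.append(gap * (len(query) - q_pos))
    query_row.append(query[q_pos:])

    return "".join(target_row), "".join(query_row)


my_target = 'CAGGTGCAGCTGGTGCAGAGCGGCGCGGAAGTGAAAAAACCGGGCAGCAGCG'
my_query = 'CAGTGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGGAGCAGCG'

# Hand-derived from the printed alignment in the question. With Biopython
# available this would presumably be aln[0].aligned (an assumption).
blocks = (
    ((0, 3), (4, 44), (45, 52)),   # target blocks
    ((0, 3), (3, 43), (43, 50)),   # query blocks
)

target_row, query_row = gapped_strings(my_target, my_query, blocks)
print(target_row)
print(query_row)
print(query_row[0:10])  # the rebuilt row is an ordinary str, so it slices
```

Because the result is a pair of ordinary strings, slicing by alignment column works the same way as aln[0][1][start:stop] does in the newer setup.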