Unable to access individual alignment strings in Biopython pairwise alignment

Question

I'm trying to access the individual aligned strings in the alignment object produced by Biopython's pairwise aligner, but I'm not getting anywhere. I mean the already aligned sequences showing gaps, as printed by print(alignment); I'd like to get each one individually, or even slice them. The documentation says this is possible, but I'm getting errors.

from Bio import Align
aligner = Align.PairwiseAligner(mode='global', gap_score=-5)
my_target = 'CAGGTGCAGCTGGTGCAGAGCGGCGCGGAAGTGAAAAAACCGGGCAGCAGCG'
my_query = 'CAGTGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGGAGCAGCG'
aln = aligner.align(my_target, my_query)
print(aln[0])

The result is:

CAGGTGCAGCTGGTGCAGAGCGGCGCGGAAGTGAAAAAACCGGGCAGCAGCG
|||-||||||||||||||||||.|||||||||||||||||||||-|||||||
CAG-TGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGG-AGCAGCG

Now I'd like to get the 'query' sequence on the bottom line by itself.
I can access aln[0].query, but that seems to be just the bare query sequence, not the aligned version with gaps.

The documentation says the alignment object should support indexing and slicing, but this simply is not working.

What I'm getting is:

aln.alignment[1]

File c:\Anaconda3\lib\site-packages\Bio\Align\__init__.py:1024, in PairwiseAlignment.__getitem__(self, key)
   1022     raise NotImplementedError
   1023 if isinstance(key, int):
-> 1024     raise NotImplementedError
   1025 if isinstance(key, tuple):
   1026     try:

NotImplementedError:

The doc: (link not preserved)

I'd appreciate any help or pointers.
Cheers.

Answer 1

Score: 2

Answering my own question.
In Biopython 1.81 with Python 3.11.3, it seems the alignments object can be indexed, and each alignment can be indexed in turn to get the aligned strings, with gaps marking the deletions/insertions.
So, with the code from the original question, I'm doing:

aln[0][1]

to get:

CAG-TGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGG-AGCAGCG

or, to slice the string:

aln[0][1][start:stop]

It does not work in Python 3.9.12, though: aln[0][N], with N being 0 or 1, gives a 'Not implemented' error (NotImplementedError).
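
For installs where aln[0][N] raises NotImplementedError, one possible workaround is to rebuild the gapped lines from the aligned coordinate blocks. The sketch below is plain Python. gapped_strings is a hypothetical helper name, not a Biopython function, and it assumes a global pairwise alignment on the forward strand, with blocks shaped like the value of the alignment's aligned attribute (one list of (start, end) index pairs for the target and one for the query). The blocks in the example were read off the printed alignment in the question by hand, not produced by Biopython. The error is presumably tied to the installed Biopython release rather than to the Python version itself, but that is an assumption.

```python
def gapped_strings(target, query, aligned):
    """Rebuild the gapped target and query lines of a global pairwise alignment.

    `aligned` is a pair (target_blocks, query_blocks); each is a sequence of
    (start, end) index pairs marking the stretches aligned to each other.
    """
    target_blocks, query_blocks = aligned
    t_line, q_line = [], []
    t_pos = q_pos = 0
    for (t_start, t_end), (q_start, q_end) in zip(target_blocks, query_blocks):
        # Plain ints keep slicing and string repetition simple even if the
        # coordinates arrive as NumPy integers.
        t_start, t_end = int(t_start), int(t_end)
        q_start, q_end = int(q_start), int(q_end)
        # Target letters skipped before this block face gaps in the query.
        t_line.append(target[t_pos:t_start])
        q_line.append("-" * (t_start - t_pos))
        # Query letters skipped before this block face gaps in the target.
        t_line.append("-" * (q_start - q_pos))
        q_line.append(query[q_pos:q_start])
        # The aligned block itself is copied letter for letter.
        t_line.append(target[t_start:t_end])
        q_line.append(query[q_start:q_end])
        t_pos, q_pos = t_end, q_end
    # Whatever is left after the last block is also unaligned.
    t_line.append(target[t_pos:])
    q_line.append("-" * (len(target) - t_pos))
    t_line.append("-" * (len(query) - q_pos))
    q_line.append(query[q_pos:])
    return "".join(t_line), "".join(q_line)


my_target = 'CAGGTGCAGCTGGTGCAGAGCGGCGCGGAAGTGAAAAAACCGGGCAGCAGCG'
my_query = 'CAGTGCAGCTGGTGCAGAGCGACGCGGAAGTGAAAAAACCGGGAGCAGCG'

# Blocks transcribed by hand from the alignment printed in the question
# (an assumption; with Biopython they would come from the alignment itself).
blocks = (
    ((0, 3), (4, 44), (45, 52)),   # target coordinates
    ((0, 3), (3, 43), (43, 50)),   # query coordinates
)

gapped_target, gapped_query = gapped_strings(my_target, my_query, blocks)
print(gapped_target)
print(gapped_query)
print(gapped_query[0:10])  # the result is an ordinary str, so slicing works
```

With Biopython installed, the call would presumably be gapped_strings(my_target, my_query, aln[0].aligned); that has not been run here. Where aln[0][1] already works, it is the simpler choice.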

huangapple
  • Posted on 2023-05-30 03:13:19
  • When reposting, please keep this link: https://go.coder-hub.com/76359855.html