Format a table with two lines in pandas

Question

Hello, I have a DataFrame such as:

Start End Feature Qualifier Function
1     35  CDS     Product   Gene1
36    67  CDS     Product   Putative_actin
69    123 CDS     Product   1_hypothetical protein
345   562 CDS     Product   2_hypothetical protein

I would like to format this table as follows:

Line1:

  • Column 1: Start
  • Column 2: End
  • Column 3: Feature

Line2:

  • Column 4: Qualifier
  • Column 5: Function

Each record should be preceded by a header line of the form ">Feature Seq<row number>".

For this example, I should end up with:

>Feature Seq1
1     35  CDS
              Product   Gene1
>Feature Seq2
36    67  CDS     
              Product   Putative_actin
>Feature Seq3
69    123 CDS
              Product   1_hypothetical protein
>Feature Seq4
345   562 CDS 
              Product   2_hypothetical protein

Each column should be separated by a tab.

Answer 1

Score: 2


With tab separators:

for idx, (_, row) in enumerate(df.iterrows(), 1):
    print(f'>Feature Seq{idx}')
    print(f'\t{row.Start}\t{row.End}\t{row.Feature}')
    print(f'\t\t\t\t{row.Qualifier}\t{row.Function}')

# Output
>Feature Seq1
	1	35	CDS
				Product	Gene1
>Feature Seq2
	36	67	CDS
				Product	Putative_actin
>Feature Seq3
	69	123	CDS
				Product	1_hypothetical protein
>Feature Seq4
	345	562	CDS
				Product	2_hypothetical protein

With fixed-width columns instead of tabs:

for idx, (_, row) in enumerate(df.iterrows(), 1):
    print(f'>Feature Seq{idx}')
    print(f'{row.Start:<5}{row.End:<5}{row.Feature}')
    print(f'{"":15}{row.Qualifier:<15}{row.Function}')

# Output
>Feature Seq1
1    35   CDS
               Product        Gene1
>Feature Seq2
36   67   CDS
               Product        Putative_actin
>Feature Seq3
69   123  CDS
               Product        1_hypothetical protein
>Feature Seq4
345  562  CDS
               Product        2_hypothetical protein
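
The tab-separated loop above prints a leading tab on the first line and four tabs on the second, while the question's expected output starts the first line at Start. Here is a minimal sketch of a variant without the leading tab, assuming the second line should begin with three empty columns (Start, End, Feature). The helper name format_feature_table and the sample rows are hypothetical, not from the original answer:

```python
def format_feature_table(rows):
    """Return the two-line-per-record layout as one tab-separated string.

    `rows` is any iterable of (start, end, feature, qualifier, function) tuples.
    """
    lines = []
    for seq_no, (start, end, feature, qualifier, function) in enumerate(rows, 1):
        lines.append(f">Feature Seq{seq_no}")
        lines.append(f"{start}\t{end}\t{feature}")
        # Three leading tabs leave the Start, End and Feature columns empty.
        lines.append(f"\t\t\t{qualifier}\t{function}")
    return "".join(line + "\n" for line in lines)


# Stand-in for the first two rows of the question's frame. With pandas, the
# rows could come from df.itertuples(index=False, name=None) instead.
sample_rows = [
    (1, 35, "CDS", "Product", "Gene1"),
    (36, 67, "CDS", "Product", "Putative_actin"),
]
text = format_feature_table(sample_rows)
print(text, end="")
```

Returning a string rather than printing makes it easy to write the result to a file in one call, for example with open(path, "w").write(text).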

Answer 2

Score: 2

You can try something like this:

# Unpack the five columns into distinct names, one per column.
print(*[f">Feature Seq{i}\n{start}\t{end}\t{feature}\n\t\t\t{qualifier}\t{function}"
        for i, (start, end, feature, qualifier, function) in enumerate(
            zip(*[df[col] for col in df.columns]), start=1)], sep="\n")

Output:

>Feature Seq1
1	35	CDS
			Product	Gene1
>Feature Seq2
36	67	CDS
			Product	Putative_actin
>Feature Seq3
69	123	CDS
			Product	1_hypothetical protein
>Feature Seq4
345	562	CDS
			Product	2_hypothetical protein

Answer 3

Score: 1

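You could concatenate the DataFrame with itself and blank out the cells each line does not need:

# Duplicate every row, then sort so that the two copies sit next to each other.
table = pd.concat([df, df]).sort_index()
table.iloc[::2, 3:] = None   # first copy keeps Start, End, Feature
table.iloc[1::2, :3] = None  # second copy keeps Qualifier, Function

>>> table
   Start    End Feature Qualifier                Function
0    1.0   35.0     CDS      None                    None
0    NaN    NaN    None   Product                   Gene1
1   36.0   67.0     CDS      None                    None
1    NaN    NaN    None   Product          Putative_actin
2   69.0  123.0     CDS      None                    None
2    NaN    NaN    None   Product  1_hypothetical protein
3  345.0  562.0     CDS      None                    None
3    NaN    NaN    None   Product  2_hypothetical protein

Note that Start and End are upcast to float, because the blanked cells in those integer columns become NaN.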
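
The table above shows Start and End as floats (1.0, 35.0) and does not yet contain the ">Feature SeqN" header lines. Here is a rough sketch of one way to finish this approach, assuming it is acceptable to cast everything to strings first and to blank cells with empty strings instead of None. The two-row sample frame and the variable names are illustrative only:

```python
import pandas as pd

# Illustrative two-row sample in the shape of the question's frame.
df = pd.DataFrame({
    "Start": [1, 36],
    "End": [35, 67],
    "Feature": ["CDS", "CDS"],
    "Qualifier": ["Product", "Product"],
    "Function": ["Gene1", "Putative_actin"],
})

# Work on a string copy so that blanking cells cannot turn 1 into 1.0.
text = df.astype(str)
table = pd.concat([text, text]).sort_index()
table.iloc[::2, 3:] = ""   # first line of each pair keeps Start, End, Feature
table.iloc[1::2, :3] = ""  # second line keeps Qualifier, Function

# Join each row with tabs; drop the trailing tabs left by the blanked cells.
rows = ["\t".join(values).rstrip("\t")
        for values in table.itertuples(index=False, name=None)]

# Put a header line in front of every pair of rows.
lines = []
for i in range(0, len(rows), 2):
    lines += [f">Feature Seq{i // 2 + 1}", rows[i], rows[i + 1]]
output = "\n".join(lines)
print(output)
```

Compared with the loops in the other answers, this keeps the intermediate table available for inspection, at the cost of a few extra steps.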

huangapple
  • Posted on May 10, 2023 at 21:28:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76219039.html