Format a table with two lines in pandas
Question
Hello, I have a dataframe such as:
Start  End  Feature  Qualifier  Function
1      35   CDS      Product    Gene1
36     67   CDS      Product    Putative_actin
69     123  CDS      Product    1_hypothetical protein
345    562  CDS      Product    2_hypothetical protein
I would like to format this table as follows:
Line1:
- Column 1: Start
- Column 2: End
- Column 3: Feature
Line2:
- Column 4: Qualifier
- Column 5: Function
with each row preceded by a header line of the form ">Feature Seq<row number>".
For this example, I should get:
>Feature Seq1
1 35 CDS
Product Gene1
>Feature Seq2
36 67 CDS
Product Putative_actin
>Feature Seq3
69 123 CDS
Product 1_hypothetical protein
>Feature Seq4
345 562 CDS
Product 2_hypothetical protein
Each column should be separated by a tab.
Answer 1
Score: 2
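With tabs:
for idx, (_, row) in enumerate(df.iterrows(), 1):
    # Header line, numbering rows from 1
    print(f'>Feature Seq{idx}')
    # First line: Start, End and Feature, tab-separated
    print(f'\t{row.Start}\t{row.End}\t{row.Feature}')
    # Second line: Qualifier and Function, indented with tabs
    print(f'\t\t\t\t{row.Qualifier}\t{row.Function}')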
# Output
>Feature Seq1
	1	35	CDS
				Product	Gene1
>Feature Seq2
	36	67	CDS
				Product	Putative_actin
>Feature Seq3
	69	123	CDS
				Product	1_hypothetical protein
>Feature Seq4
	345	562	CDS
				Product	2_hypothetical protein
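You can use fixed-width columns:
for idx, (_, row) in enumerate(df.iterrows(), 1):
    # Header line, numbering rows from 1
    print(f'>Feature Seq{idx}')
    # Start and End are left-aligned in 5-character fields
    print(f'{row.Start:<5}{row.End:<5}{row.Feature}')
    # 15 spaces of indent, then Qualifier left-aligned in a 15-character field
    print(f'{"":15}{row.Qualifier:<15}{row.Function}')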
# Output
>Feature Seq1
1    35   CDS
               Product        Gene1
>Feature Seq2
36   67   CDS
               Product        Putative_actin
>Feature Seq3
69   123  CDS
               Product        1_hypothetical protein
>Feature Seq4
345  562  CDS
               Product        2_hypothetical protein
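The loops above print straight to stdout. If the text is needed as a string, for example to write it to a file, the same layout can be collected in a small helper. This is only a sketch: the function name `format_feature_table` and the sample frame `df` are my own, and it assumes that Start belongs in the first tab-separated field and Qualifier in the fourth, as the column numbering in the question suggests.

```python
import pandas as pd


def format_feature_table(df: pd.DataFrame) -> str:
    """Build the two-lines-per-row, tab-separated layout (hypothetical helper)."""
    lines = []
    # itertuples(name=None) yields plain tuples in column order
    rows = df.itertuples(index=False, name=None)
    for i, (start, end, feature, qualifier, function) in enumerate(rows, start=1):
        lines.append(f">Feature Seq{i}")
        lines.append(f"{start}\t{end}\t{feature}")
        # Three leading tabs put Qualifier in the fourth field (assumption)
        lines.append(f"\t\t\t{qualifier}\t{function}")
    return "\n".join(lines) + "\n"


# Sample data copied from the question
df = pd.DataFrame({
    "Start": [1, 36, 69, 345],
    "End": [35, 67, 123, 562],
    "Feature": ["CDS"] * 4,
    "Qualifier": ["Product"] * 4,
    "Function": ["Gene1", "Putative_actin",
                 "1_hypothetical protein", "2_hypothetical protein"],
})

text = format_feature_table(df)
print(text, end="")
```

The returned string can then be written out with an ordinary `open(..., "w")` call.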
Answer 2
Score: 2
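You can try something like this:
# feat and func need distinct names; reusing one name would print Function in place of Feature
print(*[f">Feature Seq{i}\n{s}\t{e}\t{feat}\n\t\t\t{q}\t{func}"
        for i, (s, e, feat, q, func) in enumerate(
            zip(*[df[col] for col in df.columns]), start=1)], sep="\n")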
Output:
>Feature Seq1
1	35	CDS
			Product	Gene1
>Feature Seq2
36	67	CDS
			Product	Putative_actin
>Feature Seq3
69	123	CDS
			Product	1_hypothetical protein
>Feature Seq4
345	562	CDS
			Product	2_hypothetical protein
Answer 3
Score: 1
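You could concatenate the dataframe with itself and blank out the columns that do not belong on each line:
table = pd.concat([df, df]).sort_index()  # duplicate each row, keeping the two copies adjacent
table.iloc[::2, 3:] = None   # first copy: keep Start, End, Feature
table.iloc[1::2, :3] = None  # second copy: keep Qualifier, Function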
>>> table
   Start    End  Feature  Qualifier                Function
0    1.0   35.0      CDS       None                    None
0    NaN    NaN     None    Product                   Gene1
1   36.0   67.0      CDS       None                    None
1    NaN    NaN     None    Product          Putative_actin
2   69.0  123.0      CDS       None                    None
2    NaN    NaN     None    Product  1_hypothetical protein
3  345.0  562.0      CDS       None                    None
3    NaN    NaN     None    Product  2_hypothetical protein
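If the goal is tab-separated text rather than a display table, the same masked frame can be passed to `to_csv`. This is a minimal sketch, assuming the sample data from the question. Casting to `object` first is my addition, so that Start and End stay integers instead of turning into 1.0, 35.0 and so on. It does not emit the `>Feature SeqN` header lines, and the first line of each pair ends with two empty fields.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Start": [1, 36, 69, 345],
    "End": [35, 67, 123, 562],
    "Feature": ["CDS"] * 4,
    "Qualifier": ["Product"] * 4,
    "Function": ["Gene1", "Putative_actin",
                 "1_hypothetical protein", "2_hypothetical protein"],
})

# object dtype keeps the integers intact once missing values are introduced
table = pd.concat([df, df]).sort_index().astype(object)
table.iloc[::2, 3:] = None   # first copy: keep Start, End, Feature
table.iloc[1::2, :3] = None  # second copy: keep Qualifier, Function

# Missing values are written as empty fields by default
tsv = table.to_csv(sep="\t", header=False, index=False)
print(tsv, end="")
```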