May 10, 2023, 21:28:50

Format a table with two lines in pandas

Question

Hello, I have a dataframe such as:

Start End Feature Qualifier Function
1     35  CDS     Product   Gene1
36    67  CDS     Product   Putative_actin
69    123 CDS     Product   1_hypothetical protein
345   562 CDS     Product   2_hypothetical protein

I would like to format this table so that each row is printed on two lines:

Line 1:

Column 1: Start
Column 2: End
Column 3: Feature

Line 2:

Column 4: Qualifier
Column 5: Function

Each row should be preceded by a separator line of the form ">Feature Seq<row number>".

For this example, I should get:

>Feature Seq1
1     35  CDS
              Product   Gene1
>Feature Seq2
36    67  CDS
              Product   Putative_actin
>Feature Seq3
69    123 CDS
              Product   1_hypothetical protein
>Feature Seq4
345   562 CDS
              Product   2_hypothetical protein

Each column should be separated by a tab.
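For anyone who wants to reproduce the answers, the sample table can be rebuilt as below. This is only a minimal sketch: the question does not show how the dataframe was created, so the variable name `df` and the integer dtype of Start and End are assumptions.

```python
import pandas as pd

# Sample data copied from the question. The name `df` and the integer
# dtype of Start/End are assumptions, not something the question states.
df = pd.DataFrame({
    "Start": [1, 36, 69, 345],
    "End": [35, 67, 123, 562],
    "Feature": ["CDS"] * 4,
    "Qualifier": ["Product"] * 4,
    "Function": [
        "Gene1",
        "Putative_actin",
        "1_hypothetical protein",
        "2_hypothetical protein",
    ],
})

print(df)
```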

Answer 1

Score: 2

With tabs as separators:

for idx, (_, row) in enumerate(df.iterrows(), 1):
    print(f'>Feature Seq{idx}')
    print(f'\t{row.Start}\t{row.End}\t{row.Feature}')
    print(f'\t\t\t\t{row.Qualifier}\t{row.Function}')

# Output
>Feature Seq1
	1	35	CDS
				Product	Gene1
>Feature Seq2
	36	67	CDS
				Product	Putative_actin
>Feature Seq3
	69	123	CDS
				Product	1_hypothetical protein
>Feature Seq4
	345	562	CDS
				Product	2_hypothetical protein

You can use:

for idx, (_, row) in enumerate(df.iterrows(), 1):
    print(f'>Feature Seq{idx}')
    print(f'{row.Start:<5}{row.End:<5}{row.Feature}')
    print(f'{"":15}{row.Qualifier:<15}{row.Function}')

# Output
>Feature Seq1
1    35   CDS
               Product        Gene1
>Feature Seq2
36   67   CDS
               Product        Putative_actin
>Feature Seq3
69   123  CDS
               Product        1_hypothetical protein
>Feature Seq4
345  562  CDS
               Product        2_hypothetical protein

Answer 2

Score: 2

You can try something like this:

# Unpack the five columns in order: Start, End, Feature, Qualifier, Function
print(*[f">Feature Seq{i}\n{s}\t{e}\t{feat}\n\t\t\t{q}\t{func}"
        for i, (s, e, feat, q, func) in enumerate(
            zip(*[df[col] for col in df.columns]), start=1)], sep="\n")

Output:

>Feature Seq1
1	35	CDS
			Product	Gene1
>Feature Seq2
36	67	CDS
			Product	Putative_actin
>Feature Seq3
69	123	CDS
			Product	1_hypothetical protein
>Feature Seq4
345	562	CDS
			Product	2_hypothetical protein

Answer 3

Score: 1

You could concat the dataframe to itself and format as needed:

table = pd.concat([df, df]).sort_index()
table.iloc[::2, 3:] = None
table.iloc[1::2, :3] = None

>>> table
   Start    End Feature Qualifier                Function
0    1.0   35.0     CDS      None                    None
0    NaN    NaN    None   Product                   Gene1
1   36.0   67.0     CDS      None                    None
1    NaN    NaN    None   Product          Putative_actin
2   69.0  123.0     CDS      None                    None
2    NaN    NaN    None   Product  1_hypothetical protein
3  345.0  562.0     CDS      None                    None
3    NaN    NaN    None   Product  2_hypothetical protein
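The table above still has to be turned into text, and blanking cells with None converts Start and End to floats (1.0, 35.0). One possible way to finish the idea, offered as a sketch rather than as the answer author's method, is to cast everything to strings first, blank with empty strings, and let `to_csv` produce the tab-separated rows. The names `body` and `lines` are hypothetical.

```python
import pandas as pd

# Two rows from the question, enough to show the layout.
df = pd.DataFrame({
    "Start": [1, 36],
    "End": [35, 67],
    "Feature": ["CDS", "CDS"],
    "Qualifier": ["Product", "Product"],
    "Function": ["Gene1", "Putative_actin"],
})

# Duplicate every row, then cast to str so blanking does not create floats.
table = pd.concat([df, df]).sort_index().astype(str)
table.iloc[::2, 3:] = ""    # first copy keeps Start, End, Feature
table.iloc[1::2, :3] = ""   # second copy keeps Qualifier, Function

# One tab-separated text line per table row, without header or index.
body = table.to_csv(sep="\t", header=False, index=False).splitlines()

lines = []
for i in range(0, len(body), 2):
    lines.append(f">Feature Seq{i // 2 + 1}")
    lines.append(body[i].rstrip("\t"))  # drop the tabs of the blanked cells
    lines.append(body[i + 1])

print("\n".join(lines))
```

Whether this is preferable to a plain loop is a matter of taste; it mainly shows how the concatenated table maps onto the requested layout.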
