March 31, 2023 04:48:02

How can we melt a dataframe and list words under columns?

Question

I have a dataframe that looks like this.

import pandas as pd

data = {'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
        'start_time': [1900, 2100, 2500, 2750, 2900, 1500, 1650, 1770, 1800],
        'end_time': [2100, 2500, 2750, 2900, 3000, 1650, 1770, 1800, 1950],
        'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2]}

df = pd.DataFrame(data)
df

If I try a basic melt, like so...

df_melted = df.pivot_table(index='clean_words', columns='transaction')
df_melted.tail()

I get this...

What I really want is the transaction numbers as columns, with the words listed down the rows. So, under the column for transaction 1, these words would be listed in rows:

'good', 'evening', 'how', 'are', 'you'

Under the column for transaction 2, these words would be listed in rows:

'how', 'can', 'i', 'help'

How can I do that? The start_time and end_time columns are superfluous here.

Answer 1

Score: 1

Is this the format you want?

>>> pd.DataFrame({'1': ['good', 'evening', 'how', 'are', 'you'], '2': ['how', 'can', 'I', 'help', None]})
         1     2
0     good   how
1  evening   can
2      how     I
3      are  help
4      you  None

I haven't done that before, but you could pivot your data and collect a list of words under each transaction column.

>>> df.pivot_table(columns='transaction', values='clean_words', aggfunc=list)
transaction                               1                    2
clean_words  [good, evening, how, are, you]  [how, can, i, help]

Or group by transaction and collect a list of words.

>>> df.groupby('transaction', as_index=False).agg(clean_words=pd.NamedAgg(column='clean_words', aggfunc=list))
   transaction                     clean_words
0            1  [good, evening, how, are, you]
1            2             [how, can, i, help]
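
The list-based results above keep one cell per transaction. If the goal is one word per row under each transaction column, as the question describes, one possible approach (a sketch, not something the answer itself proposes) is to number the words within each transaction and pivot on that position. The names `row` and `wide` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
    'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})

# Position of each word inside its own transaction: 0, 1, 2, ...
row = df.groupby('transaction').cumcount()

# Use that position as the row index and the transaction number as the column.
# Shorter transactions are padded with missing values at the bottom.
wide = df.assign(row=row).pivot(index='row', columns='transaction', values='clean_words')

print(wide)
```

Because `pivot` only reshapes and never aggregates, repeated words such as 'how' stay in place, and the original word order within each transaction is preserved.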

Answer 2

Score: 1

import pandas as pd
import numpy as np

data = {'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
        'start_time': [1900, 2100, 2500, 2750, 2900, 1500, 1650, 1770, 1800],
        'end_time': [2100, 2500, 2750, 2900, 3000, 1650, 1770, 1800, 1950],
        'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2]}

df = pd.DataFrame(data)

# Collect the words of each transaction into one NumPy array per group
df_melted = df.groupby('transaction')['clean_words'].apply(np.array).reset_index()

print(df_melted)

   transaction                     clean_words
0            1  [good, evening, how, are, you]
1            2             [how, can, i, help]
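
A grouped result like this one can also be spread into one column per transaction. The following is a minimal sketch under the assumption that plain Python lists are acceptable in place of NumPy arrays; `words` and `wide` are illustrative names, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
    'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})

# One list of words per transaction, indexed by transaction number
words = df.groupby('transaction')['clean_words'].agg(list)

# Build one row per transaction (shorter lists are padded with missing values),
# then transpose so that each transaction becomes a column.
wide = pd.DataFrame(words.tolist(), index=words.index).T

print(wide)
```

The transpose is what turns the per-transaction rows into the per-transaction columns asked for in the question.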
