How can we melt a dataframe and list words under columns?

Question

I have a dataframe that looks like this.

    import pandas as pd

    data = {'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
            'start_time': [1900, 2100, 2500, 2750, 2900, 1500, 1650, 1770, 1800],
            'end_time': [2100, 2500, 2750, 2900, 3000, 1650, 1770, 1800, 1950],
            'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2]}
    df = pd.DataFrame(data)
    df

If I try a basic reshape with pivot_table, like so...

    df_melted = df.pivot_table(index='clean_words', columns='transaction')
    df_melted.tail()

I get this...

What I really want is the transaction numbers as columns, with the words listed down the rows. So, if transaction 1 were a column, these words would be listed in rows under that column:

    'good','evening','how','are','you'

Under transaction 2, these words would be listed in rows under that column:

    'how','can','i','help'

How can I do that? The start_time and end_time are kind of superfluous here.

Answer 1

Score: 1

Is this the format you want?

    >>> pd.DataFrame({'1': ['good', 'evening', 'how', 'are', 'you'], '2': ['how', 'can', 'I', 'help', None]})
             1     2
    0     good   how
    1  evening   can
    2      how     I
    3      are  help
    4      you  None

I haven't done that before, but you could pivot your data and collect a list of words under each transaction column.

    >>> df.pivot_table(columns='transaction', values='clean_words', aggfunc=list)
    transaction                               1                    2
    clean_words  [good, evening, how, are, you]  [how, can, i, help]

Or group by transaction and collect a list of words.

    >>> df.groupby('transaction', as_index=False).agg(clean_words=pd.NamedAgg(column='clean_words', aggfunc=list))
       transaction                     clean_words
    0            1  [good, evening, how, are, you]
    1            2             [how, can, i, help]
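
If you need the exact layout from the question (one column per transaction, one word per row), here is a minimal sketch. It assumes the row order within each transaction is the order you want to keep. The helper column `position` and the variables `numbered` and `wide` are names I made up for illustration, not anything from the question. The idea is to number the words inside each transaction with `cumcount`, then pivot on that number:

```python
import pandas as pd

data = {'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
        'start_time': [1900, 2100, 2500, 2750, 2900, 1500, 1650, 1770, 1800],
        'end_time': [2100, 2500, 2750, 2900, 3000, 1650, 1770, 1800, 1950],
        'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2]}
df = pd.DataFrame(data)

# Number the words within each transaction: 0, 1, 2, ...
# ('position' is a hypothetical helper column, not part of the original data.)
numbered = df.assign(position=df.groupby('transaction').cumcount())

# One row per position, one column per transaction.
# Transactions with fewer words are padded with NaN at the bottom.
wide = numbered.pivot(index='position', columns='transaction', values='clean_words')
print(wide)
```

Transaction 2 has only four words, so its last row comes out as NaN. Something like `wide.fillna('')` should blank it out if that reads better.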

Answer 2

Score: 1

    import pandas as pd
    import numpy as np

    data = {'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
            'start_time': [1900, 2100, 2500, 2750, 2900, 1500, 1650, 1770, 1800],
            'end_time': [2100, 2500, 2750, 2900, 3000, 1650, 1770, 1800, 1950],
            'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2]}
    df = pd.DataFrame(data)
    df_melted = df.groupby('transaction')['clean_words'].apply(np.array).reset_index()
    print(df_melted)

       transaction                     clean_words
    0            1  [good, evening, how, are, you]
    1            2             [how, can, i, help]
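
The grouped result above still holds each transaction's words in a single cell. As a rough sketch of one way to spread them down the rows instead (the names `words_by_transaction` and `wide` are my own), you could wrap each group's words in a `pd.Series` and let the `DataFrame` constructor line them up by position:

```python
import pandas as pd

data = {'clean_words': ['good', 'evening', 'how', 'are', 'you', 'how', 'can', 'i', 'help'],
        'start_time': [1900, 2100, 2500, 2750, 2900, 1500, 1650, 1770, 1800],
        'end_time': [2100, 2500, 2750, 2900, 3000, 1650, 1770, 1800, 1950],
        'transaction': [1, 1, 1, 1, 1, 2, 2, 2, 2]}
df = pd.DataFrame(data)

# Collect each transaction's words into a plain Python list.
words_by_transaction = df.groupby('transaction')['clean_words'].agg(list)

# Build one column per transaction. Each Series gets a 0..n-1 index, and the
# constructor aligns on it, so the shorter column is padded with NaN.
wide = pd.DataFrame({t: pd.Series(words) for t, words in words_by_transaction.items()})
print(wide)
```

This drops start_time and end_time entirely, which matches the question's remark that they are superfluous here.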

huangapple
  • Posted on March 31, 2023 at 04:48:02
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75892874-2.html