Create datasets based on authors from another dataset

Question

I have a dataset in the following format:

    dt =
        text     author     title
        ---------------------------
        text0    author0    title0
        text1    author1    title1
        ...

I would like to create separate datasets, each containing the texts of only one author. For example, the dataset named dt1 would contain the texts of author1, dt2 the texts of author2, and so on.

I would be grateful for help doing this in Python.

Update:

    dt =
       text                               author    title
       ------------------------------------------------------
    0  I would like to go to the beach    George    Beach
    1  I was in park few days ago         Nick      Park
    2  I would like to go in uni          Peter     University
    3  I have be in the airport at 8      Maria     Airport

Answer 1

Score: 1

Please try this; it is what I understand you need.

    import pandas as pd

    # Sample data: one row per text, with its author and title
    data = {
        'text': ['text0', 'text1', 'text2'],
        'author': ['author0', 'author1', 'author1'],
        'title': ['Comunicación', 'Administración', 'Ventas']
    }
    df = pd.DataFrame(data)

    # Select the rows of a single author with a boolean mask
    df1 = df[df["author"] == "author0"]
    df2 = df[df["author"] == "author1"]
    print(df1)
    print(df2)

Update:

    import pandas as pd

    data = {
        'text': ['text0', 'text1', 'text2'],
        'author': ['author0', 'author1', 'author1'],
        'title': ['Comunicación', 'Administración', 'Ventas']
    }
    df = pd.DataFrame(data)
    df1 = df[df["author"] == "author0"]
    df2 = df[df["author"] == "author1"]

    # Loop over every distinct author instead of hard-coding each name
    list_author = df['author'].unique().tolist()
    for x in list_author:
        a = df[df["author"] == x]
        print(a)
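
Note that the loop above prints each subset but reassigns `a` on every pass, so only the last author's rows are still available afterwards. If the per-author datasets need to be kept, one possible approach is to collect them in a dictionary keyed by author name. The following is only a sketch under that assumption: it uses the sample rows from the question's update, and `datasets_by_author` is a name made up for this example.

```python
import pandas as pd

# Sample rows copied from the question's update
dt = pd.DataFrame({
    "text": [
        "I would like to go to the beach",
        "I was in park few days ago",
        "I would like to go in uni",
        "I have be in the airport at 8",
    ],
    "author": ["George", "Nick", "Peter", "Maria"],
    "title": ["Beach", "Park", "University", "Airport"],
})

# groupby yields (author, sub-DataFrame) pairs; store each subset under
# its author's name. `datasets_by_author` is a hypothetical name.
datasets_by_author = {
    author: group.reset_index(drop=True)
    for author, group in dt.groupby("author")
}

# Look up one author's dataset by name
print(datasets_by_author["George"])
```

A dictionary tends to be easier to work with than separate variables such as dt1, dt2, and so on, because the number of authors does not have to be known in advance.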

huangapple
  • Posted on February 18, 2023, 02:35:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75488104.html