Create datasets based on authors from another dataset

Question

I have a dataset in the following format:


       text          author        title 
     -------------------------------------

dt =   text0         author0       title0
       text1         author1       title1
         .             .              .
         .             .              .
         .             .              .  

and I would like to create separate datasets, each containing only the texts of one author. For example, the dataset named dt1 contains the texts of author1, dt2 contains the texts of author2, and so on.

I would be grateful if you could help me do this in Python.

Update:

dt =
       text                               author    title
     ---------------------------------------------------------
0      I would like to go to the beach    George    Beach
1      I was in park few days ago         Nick      Park
2      I would like to go in uni          Peter     University
3      I have be in the airport at 8      Maria     Airport
                                                    

Answer 1

Score: 1

Please try this; it is what I understand you need.

import pandas as pd

data = {
    'text' : ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas']
}

df = pd.DataFrame(data)
df1 = df[df["author"]=="author0"]

df2 = df[df["author"]=="author1"]
print(df1)
print(df2)

Update:

import pandas as pd

data = {
    'text' : ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas']
}

df = pd.DataFrame(data)
df1 = df[df["author"]=="author0"]

df2 = df[df["author"]=="author1"]

# Collect the distinct author names, then filter the DataFrame once per author
list_author = df['author'].unique().tolist()

for x in list_author:
    a = df[df["author"] == x]
    print(a)
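
If the goal is to keep every per-author dataset around rather than only print it, one option is to group by the author column and store the pieces in a dictionary. This is a minimal sketch, assuming the sample rows from the question's update; the names dt and datasets are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question's update.
dt = pd.DataFrame({
    "text": [
        "I would like to go to the beach",
        "I was in park few days ago",
        "I would like to go in uni",
        "I have be in the airport at 8",
    ],
    "author": ["George", "Nick", "Peter", "Maria"],
    "title": ["Beach", "Park", "University", "Airport"],
})

# Build one DataFrame per author, keyed by the author name.
# sort=False keeps the authors in order of first appearance.
datasets = {
    author: group.reset_index(drop=True)
    for author, group in dt.groupby("author", sort=False)
}

# Look up a single author's dataset by name.
print(datasets["George"])
```

A dictionary avoids creating numbered variables such as dt1, dt2 by hand, and it still works when the number of authors is not known in advance.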

Posted by huangapple on 2023-02-18 02:35:00
When reposting, please keep the link to this article: https://go.coder-hub.com/75488104.html