Create datasets based on authors from another dataset
Question
I have a dataset in the following format:

dt =
text     author    title
-------------------------------------
text0    author0   title0
text1    author1   title1
. . .

I would like to create separate datasets, each containing only the texts of one author. For example, dataset dt1 would contain the texts of author1, dt2 the texts of author2, and so on.
I would be grateful for help doing this in Python.
Update:
dt =
   text                             author  title
--------------------------------------------------------
0  I would like to go to the beach  George  Beach
1  I was in park few days ago       Nick    Park
2  I would like to go in uni        Peter   University
3  I have be in the airport at 8    Maria   Airport
Answer 1
Score: 1
Try this; it is my understanding of what you need.
import pandas as pd

# Example data: author0 has one row, author1 has two.
data = {
    'text': ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas']
}
df = pd.DataFrame(data)

# A boolean mask keeps only the rows written by a single author.
df1 = df[df["author"] == "author0"]
df2 = df[df["author"] == "author1"]
print(df1)
print(df2)
Update:
import pandas as pd

data = {
    'text': ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas']
}
df = pd.DataFrame(data)
df1 = df[df["author"] == "author0"]
df2 = df[df["author"] == "author1"]

# Loop over every distinct author instead of hard-coding each name.
list_author = df['author'].unique().tolist()
for x in list_author:
    a = df[df["author"] == x]
    print(a)
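
If the authors are not known in advance, one option is to keep the per-author frames in a dictionary rather than in numbered variables such as dt1 and dt2. The following is a minimal sketch, not the answerer's code: it uses pandas `groupby`, the sample rows from the question's update, and a helper name `split_by_author` that is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question's update.
dt = pd.DataFrame({
    "text": [
        "I would like to go to the beach",
        "I was in park few days ago",
        "I would like to go in uni",
        "I have be in the airport at 8",
    ],
    "author": ["George", "Nick", "Peter", "Maria"],
    "title": ["Beach", "Park", "University", "Airport"],
})


def split_by_author(frame, column="author"):
    """Hypothetical helper: map each author to a DataFrame of that author's rows."""
    # groupby yields (key, sub-frame) pairs; sort=False keeps first-appearance order.
    return {
        name: group.reset_index(drop=True)
        for name, group in frame.groupby(column, sort=False)
    }


by_author = split_by_author(dt)
print(by_author["George"])
```

Each dataset is then looked up by author name, for example `by_author["Nick"]`, so nothing changes in the code when a new author appears in the data.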