How to convert a DataFrame stored as <class 'str'> back into a DataFrame
Question
I have a DataFrame with a column "cc" that is supposed to contain DataFrames, but I can't manipulate them because each value is a str.
Unnamed: 0 a b cc
0 0 51 101 Unnamed: 0 ...
1 1 51 102 Unnamed: 0 ...
2 2 52 101 Unnamed: 0 ...
3 3 52 102 Unnamed: 0 ...
df.cc.iloc[0]
shows me what it contains, but the value is a string.
How can I extract the DataFrame?
Where am I going wrong?
a = df.cc.iloc[0]
print(a.head())
AttributeError: 'str' object has no attribute 'head'
or
a = df.cc.iloc[0]
b = df.cc.iloc[0]
c = pd.concat([a, b], ignore_index=True)
TypeError: cannot concatenate object of type <class 'str'>; only Series and DataFrame objs are valid
Answer 1
Score: 0
Selecting the first value of a column returns a single value, here a string, so pandas Series and DataFrame methods cannot be used on that scalar.
import pandas as pd
df = pd.DataFrame({'cc': ['a', 'b']})
print(df.cc.iloc[0])
a
If you really need a one-element Series, use double brackets []:
a = df.cc.iloc[[0]]
print(a)
0 a
Name: cc, dtype: object
Or, for a one-element DataFrame:
a = df[['cc']].iloc[[0]]
print(a)
cc
0 a
Or build one with the DataFrame constructor:
a = df.cc.iloc[0]
print(a)
a
df1 = pd.DataFrame({'col':[a]})
print (df1)
col
0 a
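To see which kind of object each of the selections above returns, a quick check along these lines may help. It is only a sketch on the same toy frame; the variable names scalar, series and frame are mine and not from the question.

```python
import pandas as pd

df = pd.DataFrame({'cc': ['a', 'b']})

scalar = df.cc.iloc[0]        # single brackets: the stored value itself (a str here)
series = df.cc.iloc[[0]]      # list inside iloc: a one-element Series
frame = df[['cc']].iloc[[0]]  # list of columns + list of rows: a one-row DataFrame

for name, obj in [('scalar', scalar), ('series', series), ('frame', frame)]:
    print(name, type(obj).__name__)
```

Only the last two have methods such as head(), and only those can be passed to pd.concat.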
Storing a whole DataFrame in each cell of the cc column is not best practice, but if you do, selecting the first value returns a DataFrame:
def qwe(y, z):
    df = pd.DataFrame({"one": range(1, 10), "two": range(2, 11), "three": range(3, 12)})
    df["four"] = (df.one * (y + z)) + (df.two * (y + z)) + (df.three * (y + z))
    return df
fr = pd.DataFrame(columns=["a", "b"])
for aa in range(1, 5):
    for bb in range(1, 5):
        x = [aa, bb]
        fr.loc[len(fr)] = x
# store a DataFrame built from a and b in each cell of cc
fr["cc"] = fr.apply(lambda x: qwe(x['a'], x['b']), axis=1)
print(fr.cc)
0 one two three four
0 1 2 3 ...
1 one two three four
0 1 2 3 ...
2 one two three four
0 1 2 3 ...
3 one two three four
0 1 2 3 ...
4 one two three four
0 1 2 3 ...
5 one two three four
0 1 2 3 ...
6 one two three four
0 1 2 3 ...
7 one two three four
0 1 2 3 ...
8 one two three four
0 1 2 3 ...
9 one two three four
0 1 2 3 ...
10 one two three four
0 1 2 3 ...
11 one two three four
0 1 2 3 ...
12 one two three four
0 1 2 3 ...
13 one two three four
0 1 2 3 ...
14 one two three four
0 1 2 3 ...
15 one two three four
0 1 2 3 ...
Name: cc, dtype: object
print(fr.cc.iloc[0])
one two three four
0 1 2 3 12
1 2 3 4 18
2 3 4 5 24
3 4 5 6 30
4 5 6 7 36
5 6 7 8 42
6 7 8 9 48
7 8 9 10 54
8 9 10 11 60
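Since a DataFrame per cell is not best practice, one common alternative (my suggestion, not something from the question) is a single long DataFrame that carries a and b as ordinary columns. A minimal sketch, reusing qwe from above; long_df and subset are illustrative names:

```python
import pandas as pd

def qwe(y, z):
    df = pd.DataFrame({"one": range(1, 10), "two": range(2, 11), "three": range(3, 12)})
    df["four"] = (df.one * (y + z)) + (df.two * (y + z)) + (df.three * (y + z))
    return df

# One flat table: every block of 9 rows is tagged with the a and b that produced it.
long_df = pd.concat(
    [qwe(a, b).assign(a=a, b=b) for a in range(1, 5) for b in range(1, 5)],
    ignore_index=True,
)

# Selecting the rows for one (a, b) pair replaces fr.cc.iloc[0].
subset = long_df[(long_df.a == 1) & (long_df.b == 1)]
print(subset.head())
```

A flat table like this also survives being written to CSV and read back, whereas a DataFrame stored in a cell comes back as text.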
EDIT: To join all the DataFrames in the fr.cc column, use concat with a list comprehension:
out = pd.concat([x for x in fr.cc])
print(out)
one two three four
0 1 2 3 12
1 2 3 4 18
2 3 4 5 24
3 4 5 6 30
4 5 6 7 36
.. ... ... ... ...
4 5 6 7 144
5 6 7 8 168
6 7 8 9 192
7 8 9 10 216
8 9 10 11 240
[144 rows x 4 columns]
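In the output above the index restarts at 0 for every joined DataFrame. If that is not wanted, concat can either renumber the rows with ignore_index=True (as in the question) or label each block with keys. A small sketch; the two-frame list below is made-up data standing in for fr.cc:

```python
import pandas as pd

# Stand-in for the DataFrames stored in fr.cc (hypothetical small frames).
frames = [
    pd.DataFrame({"one": [1, 2], "four": [12, 18]}),
    pd.DataFrame({"one": [1, 2], "four": [18, 27]}),
]

# Fresh 0..n-1 index across all blocks.
flat = pd.concat(frames, ignore_index=True)

# Outer index level records which element each row came from.
tagged = pd.concat(frames, keys=[0, 1])

print(flat)
print(tagged.loc[1])  # rows that came from the second frame
```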
If the cc column mixes DataFrames with strings, keep only the DataFrames:
out = pd.concat([x for x in fr.cc if isinstance(x, pd.DataFrame)])
print(out)
one two three four
0 1 2 3 12
1 2 3 4 18
2 3 4 5 24
3 4 5 6 30
4 5 6 7 36
.. ... ... ... ...
4 5 6 7 144
5 6 7 8 168
6 7 8 9 192
7 8 9 10 216
8 9 10 11 240
[144 rows x 4 columns]
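If the strings in cc are the printed form of DataFrames (which can happen after saving to CSV and reading back), one possible way to rebuild a DataFrame is to parse the text as whitespace-separated data. This is only a sketch under strong assumptions: the text is the complete, untruncated output (no "..." rows), and neither column names nor values contain spaces. A header such as "Unnamed: 0" from the question would break it and would have to be renamed first. frame_from_repr is a hypothetical helper name.

```python
import io
import pandas as pd

def frame_from_repr(text):
    """Parse the printed form of a simple DataFrame back into a DataFrame.

    Assumes whitespace-separated columns with no spaces inside names or values.
    """
    # The header line has one field fewer than the data rows,
    # so read_csv uses the first column of each row as the index.
    return pd.read_csv(io.StringIO(text), sep=r"\s+")

# Simulate a cell that holds text instead of a DataFrame.
original = pd.DataFrame({"one": [1, 2, 3], "two": [2, 3, 4]})
text = original.to_string()

restored = frame_from_repr(text)
print(type(restored).__name__)
print(restored)
```

If you control how the data is written, it is usually more reliable to avoid the round trip through text, for example by keeping one flat DataFrame or by pickling instead of using CSV.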