How to convert dataframe <class "str"> to dataframe

Question

I have a DataFrame with a column "cc" that contains DataFrames, but I can't manipulate them because each value is a str.

   Unnamed: 0   a    b                    cc
0           0  51  101 Unnamed: 0         ...
1           1  51  102 Unnamed: 0         ...
2           2  52  101 Unnamed: 0         ...
3           3  52  102 Unnamed: 0         ...

df.cc.iloc[0] shows me what it contains, but it is a string.

How can I extract the DataFrame?
Where am I going wrong?

a = df.cc.iloc[0]
print(a.head())

AttributeError: 'str' object has no attribute 'head'

or

a = df.cc.iloc[0]
b = df.cc.iloc[0]
c = pd.concat([a, b], ignore_index=True)

TypeError: cannot concatenate object of type <class 'str'>; only Series and DataFrame objs are valid

Answer 1

Score: 0

Selecting the first value of a column returns a single value, here a string, so pandas DataFrame methods such as head() cannot be used on that scalar.

import pandas as pd

df = pd.DataFrame({'cc': ['a', 'b']})

print(df.cc.iloc[0])
a

If you really need a one-element Series, use double brackets []:

a = df.cc.iloc[[0]]
print (a)
0    a
Name: cc, dtype: object

Or, for a one-element DataFrame:

a = df[['cc']].iloc[[0]]
print(a)
  cc
0  a
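
To make the difference between the three selections concrete, here is a minimal sketch that prints the type each one returns. The variable names `scalar`, `series`, and `frame` are illustrative only and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({"cc": ["a", "b"]})

scalar = df.cc.iloc[0]        # single brackets: the cell value itself
series = df.cc.iloc[[0]]      # list indexer on a Series: one-element Series
frame = df[["cc"]].iloc[[0]]  # list indexers on a DataFrame: one-row DataFrame

print(type(scalar).__name__)  # str
print(type(series).__name__)  # Series
print(type(frame).__name__)   # DataFrame
```

Only the last two objects have pandas methods such as head(); the first is a plain Python str, which is why the AttributeError in the question appears.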

Or build a new DataFrame from the scalar with the DataFrame constructor:

a = df.cc.iloc[0]
print (a)
a

df1 = pd.DataFrame({'col': [a]})
print(df1)
  col
0   a

Storing a full DataFrame in every cell of the cc column is not best practice, but if you do, selecting the first value returns a DataFrame:

def qwe(y, z):
    df = pd.DataFrame({"one": range(1, 10), "two": range(2, 11), "three": range(3, 12)})
    df["four"] = (df.one * (y + z)) + (df.two * (y + z)) + (df.three * (y + z))
    return df

fr = pd.DataFrame(columns=["a", "b"])

# build every (a, b) pair, one row per pair
for aa in range(1, 5):
    for bb in range(1, 5):
        x = [aa, bb]
        fr.loc[len(fr)] = x

# each cell of cc is set to a DataFrame
fr["cc"] = fr.apply(lambda x: qwe(x['a'], x['b']), axis=1)
print(fr.cc)
0        one  two  three  four
0    1    2      3   ...
1        one  two  three  four
0    1    2      3   ...
2        one  two  three  four
0    1    2      3   ...
3        one  two  three  four
0    1    2      3   ...
4        one  two  three  four
0    1    2      3   ...
5        one  two  three  four
0    1    2      3   ...
6        one  two  three  four
0    1    2      3   ...
7        one  two  three  four
0    1    2      3   ...
8        one  two  three  four
0    1    2      3   ...
9        one  two  three  four
0    1    2      3   ...
10       one  two  three  four
0    1    2      3   ...
11       one  two  three  four
0    1    2      3   ...
12       one  two  three  four
0    1    2      3   ...
13       one  two  three  four
0    1    2      3   ...
14       one  two  three  four
0    1    2      3   ...
15       one  two  three  four
0    1    2      3   ...
Name: cc, dtype: object

print(fr.cc.iloc[0])
   one  two  three  four
0    1    2      3    12
1    2    3      4    18
2    3    4      5    24
3    4    5      6    30
4    5    6      7    36
5    6    7      8    42
6    7    8      9    48
7    8    9     10    54
8    9   10     11    60
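
Since keeping DataFrames inside cells is described above as not best practice, one possible alternative is to keep them in a plain dict keyed by the (a, b) pair. This is only a sketch and not part of the original answer: the dict name `frames` is hypothetical, and `qwe` is restated (with the arithmetic factored, which gives the same values) so the block runs on its own.

```python
import pandas as pd

def qwe(y, z):
    df = pd.DataFrame({"one": range(1, 10), "two": range(2, 11), "three": range(3, 12)})
    df["four"] = (df.one + df.two + df.three) * (y + z)
    return df

# One DataFrame per (a, b) pair, stored in a dict instead of a DataFrame cell.
frames = {(aa, bb): qwe(aa, bb) for aa in range(1, 5) for bb in range(1, 5)}

# Each value is a real DataFrame, so pandas methods work directly.
print(frames[(1, 1)].head())

# Passing the dict to concat uses the keys as outer index levels.
out = pd.concat(frames)
print(out.shape)
```

With this layout every lookup returns a real DataFrame, and the (a, b) pair that produced each row stays visible in the index of the concatenated result.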

EDIT: If you need to join every DataFrame in the fr.cc column, use concat with a list comprehension:

out = pd.concat([x for x in fr.cc])
print(out)
    one  two  three  four
0     1    2      3    12
1     2    3      4    18
2     3    4      5    24
3     4    5      6    30
4     5    6      7    36
..  ...  ...    ...   ...
4     5    6      7   144
5     6    7      8   168
6     7    8      9   192
7     8    9     10   216
8     9   10     11   240

[144 rows x 4 columns]

If the cc column mixes DataFrames with strings, keep only the DataFrames:

out = pd.concat([x for x in fr.cc if isinstance(x, pd.DataFrame)])
print(out)
    one  two  three  four
0     1    2      3    12
1     2    3      4    18
2     3    4      5    24
3     4    5      6    30
4     5    6      7    36
..  ...  ...    ...   ...
4     5    6      7   144
5     6    7      8   168
6     7    8      9   192
7     8    9     10   216
8     9   10     11   240

[144 rows x 4 columns]
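
If the cells really do hold the printed text of a DataFrame, as in the question, it may be possible to parse that text back. The following is a minimal sketch under several assumptions: the string is the complete printed table (not truncated with "..."), and neither column names nor values contain whitespace (a header such as "Unnamed: 0" would break this). The names `original`, `text`, and `parsed` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for a cell that holds the printed text of a DataFrame.
original = pd.DataFrame({"one": [1, 2, 3], "two": [2, 3, 4]})
text = original.to_string()
print(type(text).__name__)  # str

# Parse the whitespace-aligned text. The data rows have one more field than
# the header, so read_csv uses the first field as the index.
parsed = pd.read_csv(io.StringIO(text), sep=r"\s+")
print(parsed.head())
```

This kind of round trip is fragile, so where possible it is probably safer to avoid turning the DataFrames into strings in the first place.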

huangapple
  • Published on March 7, 2023, 17:55:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75660412.html