Apply a list of pandas API functions to many DataFrames

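Question

I want to apply a list of [`pandas.api`](https://pandas.pydata.org/docs/reference/api/pandas.api.types.pandas_dtype.html) functions to many DataFrames and collect their results.
My code: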

    import pandas as pd

    panda_apis = [
        pd.api.types.infer_dtype,
        pd.api.types.is_bool_dtype,
        pd.api.types.is_categorical_dtype,
        pd.api.types.is_complex_dtype,
        pd.api.types.is_datetime64_any_dtype,
        pd.api.types.is_datetime64_dtype,
    ]

    api_df = pd.DataFrame(data={'function': [i.__name__ for i in panda_apis]})
    api_df.head()

                      function
    0              infer_dtype
    1            is_bool_dtype
    2     is_categorical_dtype
    3         is_complex_dtype
    4  is_datetime64_any_dtype

    # Two example dataframes

    df1 =

                  Datetime   value
    0  2002-12-31 01:00:00  5077.0
    1  2002-12-31 02:00:00  4939.0
    2  2002-12-31 03:00:00  4885.0
    3  2002-12-31 04:00:00  4857.0

    df2 =

                  Datetime   value
    0  2013-12-31 01:00:00  1861.0
    1  2013-12-31 02:00:00  1835.0
    2  2013-12-31 03:00:00  1841.0
    3  2013-12-31 04:00:00  1872.0
    4  2013-12-31 05:00:00  1934.0

    df1 = df1.add_prefix('df1_')
    api_df[df1.columns] = api_df['function'].apply(lambda x: eval(x(df1)))

Current output:

    TypeError: 'str' object is not callable

Expected output:

    api_df =
                      function df1_Datetime df1_value df2_Datetime df2_value
    0              infer_dtype       string     float  ....
    1            is_bool_dtype        False     False  ..
    2     is_categorical_dtype        False     False  ..
    3         is_complex_dtype        False     False  ..
    4  is_datetime64_any_dtype         True     False  ..



# Answer 1
**Score**: 1

You can try this:

    lst = [df1.add_prefix("df1_"), df2.add_prefix("df2_")]

    out = (
        api_df.join(
            pd.concat([pd.DataFrame([
                df.apply(f) for f in panda_apis]) for df in lst], axis=1)
        )
    )

*NB: a DataFrame column can't have more than one [dtype][1]*, so any column of `out` that mixes strings and booleans is stored as `object`.

Output:

    print(out)

                      function df1_Datetime df1_value df2_Datetime df2_value
    0              infer_dtype       string  floating       string  floating
    1            is_bool_dtype        False     False        False     False
    2     is_categorical_dtype        False     False        False     False
    3         is_complex_dtype        False     False        False     False
    4  is_datetime64_any_dtype        False     False        False     False
    5      is_datetime64_dtype        False     False        False     False
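
The `False` values in the two datetime rows above follow from the `Datetime` columns holding plain strings. A minimal sketch of this, assuming a shortened copy of `df1` from the question and only two of the checks (the names `checks` and `result` are illustrative and not part of the original answer):

```python
import pandas as pd

# Shortened, hypothetical copy of df1; Datetime is deliberately left as text
df1 = pd.DataFrame({
    "Datetime": ["2002-12-31 01:00:00", "2002-12-31 02:00:00"],
    "value": [5077.0, 4939.0],
})

checks = [pd.api.types.infer_dtype, pd.api.types.is_datetime64_any_dtype]

# df.apply(f) calls f once per column, giving one row of results per check
result = pd.DataFrame([df1.apply(f) for f in checks])
result.insert(0, "function", [f.__name__ for f in checks])
print(result)
```

Here `infer_dtype` should report `string` for `Datetime`, and the datetime check should come back `False` for both columns.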

With the `Datetime` columns cast to `datetime64[ns]`, we get:

                      function df1_Datetime df1_value df2_Datetime df2_value
    0              infer_dtype   datetime64  floating   datetime64  floating
    1            is_bool_dtype        False     False        False     False
    2     is_categorical_dtype        False     False        False     False
    3         is_complex_dtype        False     False        False     False
    4  is_datetime64_any_dtype         True     False         True     False
    5      is_datetime64_dtype         True     False         True     False
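
One way to do that cast, assuming the `Datetime` column holds text in a format `pd.to_datetime` can parse, is sketched below on a shortened, hypothetical copy of `df1`:

```python
import pandas as pd

# Shortened, hypothetical copy of df1 with Datetime stored as text
df1 = pd.DataFrame({
    "Datetime": ["2002-12-31 01:00:00", "2002-12-31 02:00:00"],
    "value": [5077.0, 4939.0],
})

# Parse the text column into a datetime64 column
df1["Datetime"] = pd.to_datetime(df1["Datetime"])

inferred = pd.api.types.infer_dtype(df1["Datetime"])
is_dt_any = pd.api.types.is_datetime64_any_dtype(df1["Datetime"])
is_dt = pd.api.types.is_datetime64_dtype(df1["Datetime"])
print(inferred, is_dt_any, is_dt)
```

The same line would need to be repeated for `df2` (or applied in a loop over both frames) before building `out`.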

  [1]: https://pandas.pydata.org/docs/user_guide/basics.html#dtypes
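
As for the `TypeError` in the question: `api_df['function']` stores the function names, which are strings, so `x(df1)` tries to call a `str`. A small sketch of the difference, assuming a lookup dict (`by_name` is a hypothetical helper, not something pandas provides):

```python
import pandas as pd

checks = [pd.api.types.infer_dtype, pd.api.types.is_bool_dtype]
names = [f.__name__ for f in checks]
values = pd.Series([1.0, 2.0])

# Calling a name fails, because a str is not a function
try:
    names[0](values)
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Looking the function up by its name keeps the callable available
by_name = dict(zip(names, checks))
label = by_name["infer_dtype"](values)
print(label)
```

Iterating over the function objects directly, as in the snippet at the top of this answer, avoids the lookup altogether and does not need `eval`.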




huangapple
  • Posted on June 13, 2023, 06:14:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76460640.html