df.apply(hurst_function) gave TypeError: must be real number, not tuple, in Python

Question

I have a DataFrame column that contains the ratios of some numbers.
I want to apply a Hurst function to that column using the df.apply() method.

I don't know whether the error comes from df.apply or from hurst_function.
Consider the following code, which calculates the Hurst exponent on a column using the df.apply method:

import hurst 

def hurst_function(df_col_slice):
    display(df_col_slice)
    return hurst.compute_Hc(df_col_slice)

def func(df_col):
    
    results = round(df_col.rolling(101).apply(hurst_function)[100:],1)
    return results

func(df_col)

I get the error:

Input In [73], in func(df_col)
---> 32     results = round(df_col.rolling(101).apply(hurst_function)[100:],1)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:1843, in Rolling.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
   1822 @doc(
   1823     template_header,
   1824     create_section_header("Parameters"),
   (...)
   1841     kwargs: dict[str, Any] | None = None,
   1842 ):
-> 1843     return super().apply(
   1844         func,
   1845         raw=raw,
   1846         engine=engine,
   1847         engine_kwargs=engine_kwargs,
   1848         args=args,
   1849         kwargs=kwargs,
   1850     )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:1315, in RollingAndExpandingMixin.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
   1312 else:
   1313     raise ValueError("engine must be either 'numba' or 'cython'")
-> 1315 return self._apply(
   1316     apply_func,
   1317     numba_cache_key=numba_cache_key,
   1318     numba_args=numba_args,
   1319 )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:590, in BaseWindow._apply(self, func, name, numba_cache_key, numba_args, **kwargs)
    587     return result
    589 if self.method == "single":
--> 590     return self._apply_blockwise(homogeneous_func, name)
    591 else:
    592     return self._apply_tablewise(homogeneous_func, name)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:442, in BaseWindow._apply_blockwise(self, homogeneous_func, name)
    437 """
    438 Apply the given function to the DataFrame broken down into homogeneous
    439 sub-frames.
    440 """
    441 if self._selected_obj.ndim == 1:
--> 442     return self._apply_series(homogeneous_func, name)
    444 obj = self._create_data(self._selected_obj)
    445 if name == "count":
    446     # GH 12541: Special case for count where we support date-like types

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:431, in BaseWindow._apply_series(self, homogeneous_func, name)
    428 except (TypeError, NotImplementedError) as err:
    429     raise DataError("No numeric types to aggregate") from err
--> 431 result = homogeneous_func(values)
    432 return obj._constructor(result, index=obj.index, name=obj.name)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:582, in BaseWindow._apply.<locals>.homogeneous_func(values)
    579     return func(x, start, end, min_periods, *numba_args)
    581 with np.errstate(all="ignore"):
--> 582     result = calc(values)
    584 if numba_cache_key is not None:
    585     NUMBA_FUNC_CACHE[numba_cache_key] = func

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:579, in BaseWindow._apply.<locals>.homogeneous_func.<locals>.calc(x)
    571 start, end = window_indexer.get_window_bounds(
    572     num_values=len(x),
    573     min_periods=min_periods,
    574     center=self.center,
    575     closed=self.closed,
    576 )
    577 self._check_window_bounds(start, end, len(x))
--> 579 return func(x, start, end, min_periods, *numba_args)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:1342, in RollingAndExpandingMixin._generate_cython_apply_func.<locals>.apply_func(values, begin, end, min_periods, raw)
   1339 if not raw:
   1340     # GH 45912
   1341     values = Series(values, index=self._on)
-> 1342 return window_func(values, begin, end, min_periods)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\_libs\window\aggregations.pyx:1315, in pandas._libs.window.aggregations.roll_apply()

TypeError: must be real number, not tuple

What can I do to solve this?

Edit: display(df_col_slice) gives the following output:

0      0.282043
1      0.103355
2      0.537766
3      0.491976
4      0.535050
         ...   
96     0.022696
97     0.438995
98    -0.131486
99     0.248250
100    1.246463
Length: 101, dtype: float64

Answer 1

Score: 3

The hurst.compute_Hc function returns a tuple of 3 values:

H, c, vals = compute_Hc(df_col_slice)

where H is the Hurst exponent and c is a constant.

However, pandas._libs.window.aggregations.roll_apply() expects the function passed to it to return a single scalar, which is the reduced result of one rolling window.

That's why your hurst_function needs to return one scalar from that tuple (H, if you want the Hurst exponent) instead of the whole tuple.
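
A minimal sketch of that fix, under stated assumptions: compute_hc_stub is a hypothetical stand-in that only mimics the 3-tuple return shape of hurst.compute_Hc (it does not estimate anything), so the example runs with just pandas and NumPy. The random input series and the window parameter are also illustrative. With the real library you would presumably call hurst.compute_Hc in its place.

```python
import numpy as np
import pandas as pd


def compute_hc_stub(series):
    """Hypothetical stand-in for hurst.compute_Hc: returns a 3-tuple (H, c, vals)."""
    values = np.asarray(series, dtype=float)
    H = 0.5  # placeholder value, not a real Hurst estimate
    c = 1.0  # placeholder constant
    vals = [[len(values)], [float(values.std())]]  # placeholder extra data
    return H, c, vals


def hurst_function(df_col_slice):
    # Unpack the tuple and return only the scalar that rolling.apply can store.
    H, c, vals = compute_hc_stub(df_col_slice)
    return H


def func(df_col, window=101):
    # The first window - 1 results are NaN (incomplete windows), so skip them.
    rolled = df_col.rolling(window).apply(hurst_function)
    return rolled.iloc[window - 1:].round(1)


# Illustrative input: 150 random values in place of the asker's ratio column.
rng = np.random.default_rng(0)
df_col = pd.Series(rng.normal(size=150))
results = func(df_col)
```

Returning H alone gives roll_apply one float per window. If c is also needed, it would have to come from a second pass or an explicit loop over the windows, since each call can contribute only one scalar.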

huangapple
  • Posted on February 6, 2023, 17:44:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75359622.html