df.apply(hurst_function) gave "TypeError: must be real number, not tuple" in Python

Question

I have a DataFrame column that contains the ratios of some numbers.
I want to apply a Hurst function to that column using the df.apply() method.

I don't know whether the error lies in df.apply or in hurst_function.
Consider this code, which calculates the Hurst exponent over a rolling window of the column using the apply method:

    import hurst

    def hurst_function(df_col_slice):
        display(df_col_slice)
        return hurst.compute_Hc(df_col_slice)

    def func(df_col):
        results = round(df_col.rolling(101).apply(hurst_function)[100:], 1)
        return results

    func(df_col)

I get the error:

    Input In [73], in func(df_col)
    ---> 32 results = round(df_col.rolling(101).apply(hurst_function)[100:],1)
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:1843, in Rolling.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
    1822 @doc(
    1823 template_header,
    1824 create_section_header("Parameters"),
    (...)
    1841 kwargs: dict[str, Any] | None = None,
    1842 ):
    -> 1843 return super().apply(
    1844 func,
    1845 raw=raw,
    1846 engine=engine,
    1847 engine_kwargs=engine_kwargs,
    1848 args=args,
    1849 kwargs=kwargs,
    1850 )
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:1315, in RollingAndExpandingMixin.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
    1312 else:
    1313 raise ValueError("engine must be either 'numba' or 'cython'")
    -> 1315 return self._apply(
    1316 apply_func,
    1317 numba_cache_key=numba_cache_key,
    1318 numba_args=numba_args,
    1319 )
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:590, in BaseWindow._apply(self, func, name, numba_cache_key, numba_args, **kwargs)
    587 return result
    589 if self.method == "single":
    --> 590 return self._apply_blockwise(homogeneous_func, name)
    591 else:
    592 return self._apply_tablewise(homogeneous_func, name)
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:442, in BaseWindow._apply_blockwise(self, homogeneous_func, name)
    437 """
    438 Apply the given function to the DataFrame broken down into homogeneous
    439 sub-frames.
    440 """
    441 if self._selected_obj.ndim == 1:
    --> 442 return self._apply_series(homogeneous_func, name)
    444 obj = self._create_data(self._selected_obj)
    445 if name == "count":
    446 # GH 12541: Special case for count where we support date-like types
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:431, in BaseWindow._apply_series(self, homogeneous_func, name)
    428 except (TypeError, NotImplementedError) as err:
    429 raise DataError("No numeric types to aggregate") from err
    --> 431 result = homogeneous_func(values)
    432 return obj._constructor(result, index=obj.index, name=obj.name)
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:582, in BaseWindow._apply.<locals>.homogeneous_func(values)
    579 return func(x, start, end, min_periods, *numba_args)
    581 with np.errstate(all="ignore"):
    --> 582 result = calc(values)
    584 if numba_cache_key is not None:
    585 NUMBA_FUNC_CACHE[numba_cache_key] = func
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:579, in BaseWindow._apply.<locals>.homogeneous_func.<locals>.calc(x)
    571 start, end = window_indexer.get_window_bounds(
    572 num_values=len(x),
    573 min_periods=min_periods,
    574 center=self.center,
    575 closed=self.closed,
    576 )
    577 self._check_window_bounds(start, end, len(x))
    --> 579 return func(x, start, end, min_periods, *numba_args)
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\rolling.py:1342, in RollingAndExpandingMixin._generate_cython_apply_func.<locals>.apply_func(values, begin, end, min_periods, raw)
    1339 if not raw:
    1340 # GH 45912
    1341 values = Series(values, index=self._on)
    -> 1342 return window_func(values, begin, end, min_periods)
    File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\_libs\window\aggregations.pyx:1315, in pandas._libs.window.aggregations.roll_apply()
    TypeError: must be real number, not tuple

What can I do to solve this?

Edit: display(df_col_slice) is giving the following output:

    0      0.282043
    1      0.103355
    2      0.537766
    3      0.491976
    4      0.535050
             ...
    96     0.022696
    97     0.438995
    98    -0.131486
    99     0.248250
    100    1.246463
    Length: 101, dtype: float64

Answer 1

Score: 3

The hurst.compute_Hc function returns a tuple of 3 values:

    H, c, vals = compute_Hc(df_col_slice)

where H is the Hurst exponent and c is a constant.

However, pandas._libs.window.aggregations.roll_apply() expects the function it is given to return a single scalar: the reduced result of one rolling window. Here it receives the whole tuple, hence "must be real number, not tuple".

That's why your hurst_function needs to return one specific scalar from that tuple (for example H) rather than the tuple itself.
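
As a rough sketch of the fix described above: unpack the tuple inside hurst_function and return only the scalar. To keep the example runnable without the hurst package, compute_Hc_stub below is a hypothetical stand-in (not part of any library) that only mimics the 3-tuple return shape; its numbers are placeholders, not real Hurst exponents. With the real library, you would presumably call hurst.compute_Hc in its place.

```python
import numpy as np
import pandas as pd


def compute_Hc_stub(series):
    """Hypothetical stand-in for hurst.compute_Hc: returns a 3-tuple."""
    values = np.asarray(series, dtype=float)
    H = float(values.mean())  # placeholder value, NOT a real Hurst exponent
    c = 1.0                   # placeholder constant
    vals = [[len(values)], [float(values.std())]]  # placeholder extra data
    return H, c, vals


def hurst_function(df_col_slice):
    # Unpack the tuple and return only the scalar;
    # rolling().apply() can store one number per window, not a tuple.
    H, c, vals = compute_Hc_stub(df_col_slice)
    return H


# Synthetic data standing in for the ratio column from the question.
rng = np.random.default_rng(0)
df_col = pd.Series(rng.normal(size=300))

# The first 100 windows are incomplete (NaN), so drop them before rounding.
results = df_col.rolling(101).apply(hurst_function).dropna().round(1)
print(results.head())
```

Returning `hurst.compute_Hc(df_col_slice)[0]` would be an equivalent one-liner; unpacking by name just makes it clearer which of the three values is kept.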

huangapple
  • Posted on February 6, 2023, 17:44:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75359622.html