Reference polars.DataFrame.height in with_columns

Question

Take this example:

    import datetime

    import numpy
    import polars

    df = (polars
        .DataFrame(dict(
            j=polars.date_range(datetime.date(2023, 1, 1), datetime.date(2023, 1, 3), '8h', closed='left', eager=True),
        ))
        .with_columns(
            # 6 is hard-coded to match the number of rows
            k=polars.lit(numpy.random.randint(10, 99, 6)),
        )
    )

Output:

    j                    k
    2023-01-01 00:00:00  47
    2023-01-01 08:00:00  22
    2023-01-01 16:00:00  82
    2023-01-02 00:00:00  19
    2023-01-02 08:00:00  85
    2023-01-02 16:00:00  15
    shape: (6, 2)

Here, numpy.random.randint(10, 99, 6) hard-codes 6 as the height of the DataFrame, so it breaks as soon as the height changes, e.g. if I change the interval from 8h to 4h (which would require changing 6 to 12).

I know I can do it by breaking the chain:

    import datetime

    import numpy
    import polars

    df = polars.DataFrame(dict(
        j=polars.date_range(datetime.date(2023, 1, 1), datetime.date(2023, 1, 3), '4h', closed='left', eager=True),
    ))
    # df now exists, so its height can be read directly
    df = df.with_columns(
        k=polars.lit(numpy.random.randint(10, 99, df.height)),
    )

Output:

    j                    k
    2023-01-01 00:00:00  47
    2023-01-01 04:00:00  22
    2023-01-01 08:00:00  82
    2023-01-01 12:00:00  19
    2023-01-01 16:00:00  85
    2023-01-01 20:00:00  15
    2023-01-02 00:00:00  89
    2023-01-02 04:00:00  74
    2023-01-02 08:00:00  26
    2023-01-02 12:00:00  11
    2023-01-02 16:00:00  86
    2023-01-02 20:00:00  81
    shape: (12, 2)

Is there a way to do it (i.e. reference df.height or an equivalent) in one chained expression though?

Answer 1

Score: 2

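You can use .pipe(). It passes the DataFrame built so far to a function, so df.height is available without breaking the chain: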

    import datetime

    import numpy as np
    import polars as pl

    (
        pl.date_range(
            datetime.date(2023, 1, 1),
            datetime.date(2023, 1, 3),
            '4h',
            closed='left',
            eager=True
        )
        .to_frame()
        # the lambda receives the frame, so df.height is known here
        .pipe(lambda df:
            df.with_columns(rand =
                pl.lit(np.random.randint(10, 99, df.height))
            )
        )
    )

Output:

    shape: (12, 2)
    ┌─────────────────────┬──────┐
    │ date                ┆ rand │
    │ ---                 ┆ ---  │
    │ datetime[μs]        ┆ i64  │
    ╞═════════════════════╪══════╡
    │ 2023-01-01 00:00:00 ┆ 39   │
    │ 2023-01-01 04:00:00 ┆ 45   │
    │ 2023-01-01 08:00:00 ┆ 95   │
    │ 2023-01-01 12:00:00 ┆ 72   │
    │ …                   ┆ …    │
    │ 2023-01-02 08:00:00 ┆ 34   │
    │ 2023-01-02 12:00:00 ┆ 42   │
    │ 2023-01-02 16:00:00 ┆ 30   │
    │ 2023-01-02 20:00:00 ┆ 83   │
    └─────────────────────┴──────┘

Posted by huangapple on June 12, 2023, 05:18:49
Source: https://go.coder-hub.com/76452551.html