Serializing Polars expressions as JSON or YAML file?
Question
I am extremely happy with the polars expression syntax, so much so that a lot of my feature engineering is expressed in polars expressions.
However, I am now trying to move the feature engineering to JSON or YAML files (for MLOps reasons).
The question is - how could I encode this as a JSON file:
import polars as pl

configuration = {
    'features': [
        pl.col('col1').fill_null(0).log().le(0.2).alias('feature1'),
        pl.col('col2').fill_null(0).log().le(0.2).alias('feature2'),
        pl.col('col3').fill_null(0).log().le(0.2).alias('feature3'),
    ],
    'filters': [
        pl.col('col4') >= 500_000,
        pl.col('col5').is_in(['A', 'B']),
    ],
}

# This is how I use it - just for context
X = (
    df
    .filter(pl.all(configuration['filters']))
    .select(configuration['features'])
)
Any ideas on how I could serialize (or re-write) this as JSON such that it could be converted back to Polars expressions?
Note that this question has a lot of overlap with https://stackoverflow.com/questions/74976313/possible-to-stringize-a-polars-expression, but it's not a duplicate.
Answer 1
Score: 4
As of Polars 0.18.1, expressions can be serialized to JSON and deserialized back directly.
import polars as pl

def test_expression_json() -> None:
    # create an expression
    e = pl.col("foo").sum().over("bar")
    # serialize to json (named json_str to avoid shadowing the stdlib json module)
    json_str = e.meta.write_json()
    # deserialize back to an expression
    round_tripped = pl.Expr.from_json(json_str)
    # assert expression equality
    assert round_tripped.meta == e