Filtering with polars in Rust - Eagerly
Question
I'm trying to do a simple filter with polars in Rust:
let mask = df.column("AISLE_ID").unwrap().eq(lit(1));
let filtered_df = df.filter(&mask).unwrap();
But it's not working at all: expected &ChunkedArray<...>, found &bool
I can do it the lazy way, but I don't want to clone the DataFrame:
let dfe = df.clone();
let filtered_df = dfe.lazy().filter(
    col("AISLE_ID").eq(lit(1))
)
.collect();
Can you help me?
Answer 1
Score: 1
As others have mentioned, a filter always copies data into a new DataFrame, whether it is lazy or eager. The difference is when the copy happens (and, in the lazy case, which optimizations can be applied when several transformations are chained on the same LazyFrame).
In your original lazy example, the initial let dfe = df.clone() is not necessary as long as you don't need df afterwards, because lazy() takes the DataFrame by value. The following code compiles and works as expected:
use polars::prelude::*;

fn main() {
    let s0 = Series::new("AISLE_ID", [0, 1, 2].as_ref());
    let s1 = Series::new("temp", [22.1, 19.9, 7.].as_ref());
    let df = DataFrame::new(vec![s0, s1]).unwrap();
    let filtered_df = df.lazy().filter(
        col("AISLE_ID").eq(lit(1))
    )
    .collect();
    println!("{:?}", filtered_df);
}
Returning:
Ok(shape: (1, 2)
┌──────────┬──────┐
│ AISLE_ID ┆ temp │
│ ---      ┆ ---  │
│ i32      ┆ f64  │
╞══════════╪══════╡
│ 1        ┆ 19.9 │
└──────────┴──────┘)
Cargo.toml:
[dependencies]
polars = { version = "0.29.0", features = ["lazy"] }