Filtering with polars in Rust - Eagerly

Question

I'm trying to do a simple filter with polars in Rust:

let mask = df.column("AISLE_ID").unwrap().eq(lit(1));
let filtered_df = df.filter(&mask).unwrap();

But it doesn't work at all: expected &ChunkedArray<...>, found &bool

I can do it the lazy way, but I don't want to clone the DataFrame:

let dfe = df.clone();
let filtered_df = dfe.lazy().filter(
    col("AISLE_ID").eq(lit(1))
)
.collect();

Can you help me?

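Answer 1

Score: 1

As others have mentioned, a filter copies data whether it is lazy or eager, because it creates a new DataFrame. The difference is when the copy happens (and, for a lazy DataFrame, which optimizations apply when several transformations are chained).

In your original lazy example, the initial let dfe = df.clone() is unnecessary as long as you don't need df afterwards, because lazy() takes the DataFrame by value. The following code compiles and works as expected: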

use polars::prelude::*;

fn main() {
    let s0 = Series::new("AISLE_ID", [0, 1, 2].as_ref());
    let s1 = Series::new("temp", [22.1, 19.9, 7.].as_ref());
    let df = DataFrame::new(vec![s0, s1]).unwrap();
    let filtered_df = df.lazy().filter(
        col("AISLE_ID").eq(lit(1))
    )
    .collect();

    println!("{:?}", filtered_df)
}

Output:

Ok(shape: (1, 2)
┌──────────┬──────┐
│ AISLE_ID ┆ temp │
│ ---      ┆ ---  │
│ i32      ┆ f64  │
╞══════════╪══════╡
│ 1        ┆ 19.9 │
└──────────┴──────┘)

Cargo.toml:

[dependencies]
polars = { version = "0.29.0", features = ["lazy"] }
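For the eager route the question asks about, the error message suggests that filter wants a boolean mask with one entry per row, while .eq(...) on the column produced a single bool (presumably Rust's PartialEq::eq). In Polars the element-wise comparison on a Series is equal, from the ChunkCompare trait; its exact return type has varied between releases, so check the docs for your version. The sketch below models the idea in plain Rust without Polars: Frame, aisle_id_equal and filter are hypothetical stand-ins, not Polars APIs. It also illustrates the point above that filtering builds a new frame and leaves the original intact.

```rust
// Plain-Rust model of eager mask filtering. `Frame` is a hypothetical
// stand-in for a two-column DataFrame, not a Polars type.
#[derive(Debug, PartialEq)]
struct Frame {
    aisle_id: Vec<i32>,
    temp: Vec<f64>,
}

impl Frame {
    /// Element-wise comparison: one bool per row (a mask), not a single bool.
    fn aisle_id_equal(&self, value: i32) -> Vec<bool> {
        self.aisle_id.iter().map(|&id| id == value).collect()
    }

    /// Keep the rows whose mask entry is true. Borrows `self` and returns a
    /// new Frame, so the kept rows are copied and the original is untouched.
    fn filter(&self, mask: &[bool]) -> Frame {
        assert_eq!(mask.len(), self.aisle_id.len(), "mask length must match row count");
        let mut out = Frame { aisle_id: Vec::new(), temp: Vec::new() };
        for (row, &keep) in mask.iter().enumerate() {
            if keep {
                out.aisle_id.push(self.aisle_id[row]);
                out.temp.push(self.temp[row]);
            }
        }
        out
    }
}

fn main() {
    let df = Frame { aisle_id: vec![0, 1, 2], temp: vec![22.1, 19.9, 7.0] };
    let mask = df.aisle_id_equal(1); // per-row mask
    let filtered = df.filter(&mask); // new Frame; `df` is still usable
    println!("{:?}", mask);
    println!("{:?}", filtered);
    println!("original rows: {}", df.aisle_id.len());
}
```

Because filter only borrows the frame, no clone is needed before calling it; the copy is limited to the rows that survive the mask.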

Posted by huangapple on 2023-05-28 18:29:03
Please keep this link when reposting: https://go.coder-hub.com/76351030.html