如何在使用高阶函数筛选一组项目时避免分配?

huangapple go评论59阅读模式
英文:

How can I avoid allocation when filtering on a set of items with a higher order function?

问题

尝试根据在运行时计算的泛型谓词来筛选一组项目:

fn main () {
    let el = vec![
        vec![10, 20, 30], 
        vec![40, 50, 60]
    ];
    
    println!("{:?}", use_predicate(el, a_predicate)); // [[40, 50, 60]]
}

fn a_predicate(it: &mut impl Iterator<Item = u64>) -> bool {
    it.any(|n| n > 50)
}

pub fn use_predicate<I: Iterator<Item = u64>>(
    items: Vec<Vec<u32>, 
    predicate: impl FnMut(&mut I) -> bool
) -> Vec<Vec<u32> {
    items
        .into_iter()
        // ISSUE HERE
        //
        // If I collect and rewrite predicate as predicat(Vec<u64>), it works,
        // however I want to avoid allocation just to iterate over the elements.
        //
        // It also works if I just call `a_predicate` directly, but I need to 
        // pass the predicate as a parameter...
        //
        // In other words, the error doesn't happen unless I have nested generics.
        .filter(|it| predicate(&mut it.iter().map(|it| *it as u64)))
        .collect::<Vec<_>>()
}

编译错误表明 rustc 无法将 u64 迭代器视为 impl Iterator<Item = u64>

error[E0308]: mismatched types
  --> src/main.rs:25:32
   |
10 | pub fn use_predicate<I: Iterator<Item = u64>>(
   |                      - this type parameter
...
25 |         .filter(|it| predicate(&mut it.iter().map(|it| *it as u64)))
   |                      --------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter `I`, found struct `Map`
   |                      |
   |                      arguments to this function are incorrect
   |
   = note: expected mutable reference `&mut I`
              found mutable reference `&mut Map<std::slice::Iter<_, u32>, [closure@src/main.rs:25:51: 25:55]>`

编辑:正如下面建议的使用 dyn,不是不可接受的解决方案,因为我确信仍然比收集到堆分配的 Vec<_> 快得多。然而,在生产中,我面临着一个难以解决的生命周期问题,如果没有将 predicate: FnMut 包装在 Mutex 中并在每次需要评估谓词时解锁。在这一点上,我对性能前景并不那么乐观。

我制作了另一个 playground,以更好地表示这个问题。这里的问题是通过包装在 Arc<Mutex<_>> 中来解决的。您是否能想到一种在不使用 Mutex 的情况下解决这个问题的方法?

英文:

Trying to filter a set of items, based on a predicate that is generic, because computed at run-time:

fn main () {
    let el = vec![
        vec![10, 20, 30], 
        vec![40, 50, 60]
    ];
    
    println!(&quot;{:?}&quot;, use_predicate(el, a_predicate)); // [[40, 50, 60]]
}

fn a_predicate(it: &amp;mut impl Iterator&lt;Item = u64&gt;) -&gt; bool {
    it.any(|n| n &gt; 50)
}

pub fn use_predicate&lt;I: Iterator&lt;Item = u64&gt;&gt;(
    items: Vec&lt;Vec&lt;u32&gt;&gt;, 
    predicate: impl FnMut(&amp;mut I) -&gt; bool
) -&gt; Vec&lt;Vec&lt;u32&gt;&gt; {
    items
        .into_iter()
        // ISSUE HERE
        //
        // If I collect and rewrite predicate as predicat(Vec&lt;u64&gt;), it works,
        // however I want to avoid allocation just to iterate over the elements.
        //
        // It also works if I just call `a_predicate` directly, but I need to 
        // pass the predicate as a parameter...
        //
        // In other words, the error doesn&#39;t happen unless I have nested generics.
        .filter(|it| predicate(&amp;mut it.iter().map(|it| *it as u64)))
        .collect::&lt;Vec&lt;_&gt;&gt;()
}

The compilation error indicate that rustc is not able to consider the u64 iterator as a impl Iterator&lt;Item = u64&gt;.

error[E0308]: mismatched types
  --&gt; src/main.rs:25:32
   |
10 | pub fn use_predicate&lt;I: Iterator&lt;Item = u64&gt;&gt;(
   |                      - this type parameter
...
25 |         .filter(|it| predicate(&amp;mut it.iter().map(|it| *it as u64)))
   |                      --------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter `I`, found struct `Map`
   |                      |
   |                      arguments to this function are incorrect
   |
   = note: expected mutable reference `&amp;mut I`
              found mutable reference `&amp;mut Map&lt;std::slice::Iter&lt;&#39;_, u32&gt;, [closure@src/main.rs:25:51: 25:55]&gt;`

I am not sure how to restrict the type while avoiding to collect into a Vec&lt;_&gt;, which is useless cost, since we only need to iterate over the items in order to filter.

Link to the playground


EDIT: Using dyn as suggested below is not an undesirable solution, as I'm sure is still much faster than collecting into a heap allocated Vec&lt;_&gt;. However, in production, I am facing a lifetime issue that is difficult to solve without wrapping the predicate: FnMut in a Mutex, and unlocking each time a predicate must be evaluated. And at this point, I am not so optimistic about the performance outlook.

I made another playground to better represent the issue. Here's the issue being solved by wrapping in an Arc&lt;Mutex&lt;_&gt;&gt;.
Can you think of a way to solve this without making use of the Mutex?

答案1

得分: 1

很抱歉,你的请求超出了我目前的功能范围,因为你要求只返回翻译的部分,不提供其他回应。如果你有其他需要翻译的内容,请随时提出。

英文:

Unfortunately, you can't express this type in Rust.

The first thing to understand is that generics are type arguments. That is, if a function uses a generic, the type of the generic is another input to the function, but at compile time. That's why I prefer the terminology "type parameter" rather than generic, but somehow it hasn't caught on much outside of functional programming languages.

So, the type parameter I is an input to use_predicate and gets determined at the call site. But inside the function you pass a very specific type of iterator to the predicate - one which the caller definitely did not provide. That's what expected type parameter &#39;I&#39;, found struct &#39;Map&#39; is saying.

Perhaps you also tried:

pub fn use_predicate&lt;P&gt;(
    items: Vec&lt;Vec&lt;u32&gt;&gt;, 
    predicate: impl FnMut(&amp;mut impl Iterator&lt;Item = u64&gt;) -&gt; bool
) -&gt; Vec&lt;Vec&lt;u32&gt;&gt; { ... }

This isn't allowed. The only way that this could currently be desugared in Rust is to essentially the same thing as you gave in your question. It would have the same problem, but also probably isn't the way that the Rust developers would want to desugar that. More likely, they'd want to desugar it using higher-ranked bounds:

pub fn use_predicate&lt;P&gt;(items: Vec&lt;Vec&lt;u32&gt;&gt;, predicate: P) -&gt; Vec&lt;Vec&lt;u32&gt;&gt;
where
    P: for&lt;I: Iterator&lt;Item = u64&gt;&gt; FnMut(&amp;mut I) -&gt; bool { ... }

Now this is exactly what you want! Sadly, this isn't currently supported, and I haven't seen very much enthusiasm for such a feature or any movement towards implementing it.


So what can you do?

You may be able to simplify your requirement. But, assuming you can't, you can use dynamic dispatch:

fn a_predicate(mut items: &amp;mut dyn Iterator&lt;Item = u64&gt;) -&gt; bool {
    // Clunky workaround because Iterator::any isn&#39;t object-safe
    (&amp;mut items).any(|n| n &gt; 50)
}

pub fn use_predicate(
    items: Vec&lt;Vec&lt;u32&gt;&gt;,
    mut predicate: impl FnMut(&amp;mut dyn Iterator&lt;Item = u64&gt;) -&gt; bool,
) -&gt; Vec&lt;Vec&lt;u32&gt;&gt; {
    items
        .into_iter()
        .filter(|items| predicate(&amp;mut items.iter().map(|&amp;n| n as u64)))
        .collect::&lt;Vec&lt;_&gt;&gt;()
}

huangapple
  • 本文由 发表于 2023年4月10日 20:21:33
  • 转载请务必保留本文链接:https://go.coder-hub.com/75977081.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定