Why does the enumerator always take the first 3 items instead of the next batch?

Question

I'm working on a proof of concept that splits a list of strings into batches and processes each batch asynchronously.
When I run the program, however, it always takes the first set of items (3, matching the batch size) and never advances. How can I move on to the next set of items?
Take is an extension method that I wrote myself, and I'm using the async/await pattern for the processing.

Thanks in advance

public class Program
{
    public static async Task Main(string[] args)
    {
        var obj = new Class1();
        List<string> fruits = new()
        {
            "1",
            "2",
            "3",
            "4",
            "5",
            "6",
            "7",
            "8",
            "9",
            "10"
        };

        await Class1.Start(fruits);
        Console.ReadLine();
    }
}

public class Class1
{
    private const int batchSize = 3;

    public static async Task Start(List<string> fruits)
    {
        if (fruits == null)
            return;

        var e = fruits.GetEnumerator();
        while (true)
        {
            var batch = e.Take(3); // always takes the first 3 items and never moves on to the next items of the list
            if (batch.Count == 0)
            {
                break;
            }
            await StartProcessing(batch);
        }
    }

    public static async Task StartProcessing(List<string> batch)
    {
        await Parallel.ForEachAsync(batch, async (item, CancellationToken) =>
        {
            var list = new List<string>();
            await Task.Delay(1000);
            Console.WriteLine($"Fruit Name: {item}");
            list.Add(item);
        });
    }
}

Extension.cs

public static class Extensions
{
    public static List<T> Take<T>(this IEnumerator<T> e, int num)
    {
        List<T> list = new List<T>(num);
        int taken = 0;
        while (taken < num && e.MoveNext())
        {
            list.Add(e.Current);
            taken++;
        }

        return list;
    }
}

Answer 1

Score: 6

List<T>.Enumerator is a struct. Your Take extension method accepts an IEnumerator<T>, so each call boxes a fresh copy of e, and MoveNext advances that copy rather than your variable. Here is a simpler example using your extension method:

using System;
using System.Collections.Generic;

public class Program
{
    public static void Main()
    {
        List<string> fruits = new() { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" };

        var e = fruits.GetEnumerator();
        var firstThree = e.Take(3);
        var nextThree = e.Take(3);

        // prints 1, 2, 3
        foreach (var x in firstThree)
            Console.WriteLine(x);

        // also prints 1, 2, 3
        foreach (var x in nextThree)
            Console.WriteLine(x);
    }
}

public static class Extensions
{
    public static List<T> Take<T>(this IEnumerator<T> e, int num)
    {
        List<T> list = new List<T>(num);
        int taken = 0;
        while (taken < num && e.MoveNext())
        {
            list.Add(e.Current);
            taken++;
        }

        return list;
    }
}

You can fix this by making sure that e holds a boxed enumerator, so that every call to Take works on the same object. To do that, replace

var e = fruits.GetEnumerator();

with

IEnumerator<string> e = fruits.GetEnumerator();
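
As a minimal sketch of that fix applied to the batching loop from the question: the variable is declared as the interface type IEnumerator<string>, so the struct is boxed once and every Take call advances the same object. The names BoxedEnumeratorDemo and Batches are made up for this illustration; Take is the extension method from the question, slightly condensed.

```csharp
using System;
using System.Collections.Generic;

public static class Extensions
{
    // Same behavior as the Take method from the question.
    public static List<T> Take<T>(this IEnumerator<T> e, int num)
    {
        var list = new List<T>(num);
        while (list.Count < num && e.MoveNext())
            list.Add(e.Current);
        return list;
    }
}

// Hypothetical helper class, introduced only for this example.
public static class BoxedEnumeratorDemo
{
    public static List<List<string>> Batches(List<string> items, int batchSize)
    {
        var result = new List<List<string>>();

        // Declaring e as the interface boxes the struct exactly once;
        // each Take call below then advances that one boxed object.
        using IEnumerator<string> e = items.GetEnumerator();

        while (true)
        {
            var batch = e.Take(batchSize);
            if (batch.Count == 0)
                break;
            result.Add(batch);
        }

        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var fruits = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" };

        foreach (var batch in BoxedEnumeratorDemo.Batches(fruits, 3))
            Console.WriteLine(string.Join(", ", batch));
    }
}
```

With the ten-item list this prints four lines: "1, 2, 3", "4, 5, 6", "7, 8, 9" and "10".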

Alternatively, C# 7.2 and later allow ref extension methods on structs, which would enable you to do something like this:

var e = fruits.GetEnumerator();

// The type arguments must be given explicitly: T appears only in the
// constraint on TEnum, and C# does not infer type arguments from constraints
var firstThree = e.Take<string, List<string>.Enumerator>(3);
var nextThree = e.Take<string, List<string>.Enumerator>(3);

...

public static class Extensions
{
    public static List<T> Take<T, TEnum>(ref this TEnum e, int num)
        where TEnum : struct, IEnumerator<T>
    {
        ...
    }
}
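
The body of the ref extension method is elided above. One possible way to fill it in, assuming it simply mirrors the loop of the original Take, is sketched below. The class RefEnumeratorDemo and its Batches helper are hypothetical names added for the example.

```csharp
using System;
using System.Collections.Generic;

public static class Extensions
{
    // 'ref this' hands the caller's struct over by reference, so MoveNext
    // advances the caller's enumerator instead of a copy.
    public static List<T> Take<T, TEnum>(ref this TEnum e, int num)
        where TEnum : struct, IEnumerator<T>
    {
        var list = new List<T>(num);
        while (list.Count < num && e.MoveNext())
            list.Add(e.Current);
        return list;
    }
}

// Hypothetical helper class, introduced only for this example.
public static class RefEnumeratorDemo
{
    public static List<List<string>> Batches(List<string> items, int batchSize)
    {
        var result = new List<List<string>>();

        // e stays a List<string>.Enumerator struct; nothing is boxed.
        // It must be an ordinary mutable local to be passed by reference.
        var e = items.GetEnumerator();

        while (true)
        {
            // Explicit type arguments: T cannot be inferred from the constraint.
            var batch = e.Take<string, List<string>.Enumerator>(batchSize);
            if (batch.Count == 0)
                break;
            result.Add(batch);
        }

        e.Dispose();
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var fruits = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" };

        foreach (var batch in RefEnumeratorDemo.Batches(fruits, 3))
            Console.WriteLine(string.Join(", ", batch));
    }
}
```

Note that e cannot be declared with a using declaration here, because such variables are read-only and cannot be passed by ref; the sketch calls Dispose by hand instead.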

But, honestly, the real reason your code does not work is that enumerators aren't meant to be used like this. The built-in Enumerable.Take method works on enumerables (IEnumerable<T>), not on enumerators (IEnumerator<T>), and that's the idiomatic way to do these things in .NET.
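
To illustrate working on the enumerable rather than on an enumerator, here is a small sketch that batches the list with the built-in Skip and Take. The class SkipTakeDemo and the method Batches are names invented for this example, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, introduced only for this example.
public static class SkipTakeDemo
{
    public static List<List<string>> Batches(List<string> items, int batchSize)
    {
        var result = new List<List<string>>();

        for (int offset = 0; offset < items.Count; offset += batchSize)
        {
            // Each iteration builds a fresh query over the list, so no
            // enumerator state is shared between batches.
            result.Add(items.Skip(offset).Take(batchSize).ToList());
        }

        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var fruits = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" };

        foreach (var batch in SkipTakeDemo.Batches(fruits, 3))
            Console.WriteLine(string.Join(", ", batch));
    }
}
```

This is fine for a list held in memory. For a general sequence that is expensive to enumerate, repeatedly skipping from the start may redo work, which is one reason a single-pass method is usually preferable.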

For your use case, Enumerable.Chunk (available since .NET 6) is the most appropriate built-in method. If you want to see how it could be implemented from scratch for educational purposes, have a look at related questions on the topic.
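
A rough sketch of the question's batching loop rewritten with Enumerable.Chunk follows, together with a simplified from-scratch iterator for comparison. ChunkDemo, BatchExtensions and Batch are hypothetical names, the returned list exists only so the result can be inspected, and the shortened delay is just a stand-in for real work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical class, introduced only for this example.
public static class ChunkDemo
{
    private const int BatchSize = 3;

    // Processes one batch at a time; items within a batch run concurrently.
    public static async Task<List<string[]>> Start(List<string> fruits)
    {
        var processed = new List<string[]>();
        if (fruits == null)
            return processed;

        // Chunk yields arrays of up to BatchSize items; the last may be shorter.
        foreach (string[] batch in fruits.Chunk(BatchSize))
        {
            await Parallel.ForEachAsync(batch, async (item, cancellationToken) =>
            {
                await Task.Delay(100, cancellationToken); // stand-in for real work
                Console.WriteLine($"Fruit Name: {item}");
            });
            processed.Add(batch);
        }

        return processed;
    }
}

// Simplified hand-written equivalent, for illustration only.
public static class BatchExtensions
{
    public static IEnumerable<List<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }

        // Emit the final, possibly shorter, batch.
        if (batch.Count > 0)
            yield return batch;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var fruits = new List<string> { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" };

        await ChunkDemo.Start(fruits);

        foreach (var batch in fruits.Batch(3))
            Console.WriteLine(string.Join(", ", batch));
    }
}
```

Within a batch the order of the "Fruit Name" lines is not guaranteed, because Parallel.ForEachAsync runs the items concurrently; the batches themselves are processed in order.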

Posted by huangapple on April 19, 2023, 14:38:46
Source: https://go.coder-hub.com/76051418.html