Remove duplicates from List of class objects, based on string array instance property

Question

class RowData
{
    // Public members are needed for the object initializers below to compile.
    public string Name { get; set; }
    public string[] Aliases { get; set; }
}

List<RowData> rowData = new List<RowData>();
rowData.Add(new RowData { Name = "John Doe", Aliases = new string[] { "johndoe", "jdoe" } });
rowData.Add(new RowData { Name = "Jane Doe", Aliases = new string[] { "janedoe", "jdoe" } });
rowData.Add(new RowData { Name = "John Doe", Aliases = new string[] { "johndoe", "jdoe" } });

// Alias arrays that have already been processed.
List<string[]> aliasList = new List<string[]>();

foreach (var row in rowData)
{
    // Skip the row if an earlier row had the same aliases in the same order.
    if (aliasList.Any(obj => obj.SequenceEqual(row.Aliases)))
    {
        continue;
    }
    aliasList.Add(row.Aliases);

    MakeRow(row.Name, row.Aliases);
}

The final result should contain only the items at index 0 and 1 of the list, because the Aliases of the items at index 0 and 2 are equal.

I've tried different ways of filtering out rows with equal Aliases from the list of RowData objects, but this is the best I could come up with. Is there a better way to do this with LINQ?

Answer 1

Score: 3

I would use a custom IEqualityComparer<RowData>, which works with many LINQ methods. In this case you can pass it to Distinct:

public class RowDataAliasComparer : IEqualityComparer<RowData>
{
    public bool Equals(RowData x, RowData y)
    {
        if (x?.Aliases == null && y?.Aliases == null)
        {
            return true;
        }

        if (x?.Aliases == null || y?.Aliases == null)
        {
            return false;
        }

        return x.Aliases.OrderBy(a => a).SequenceEqual(y.Aliases.OrderBy(a => a));
    }

    public int GetHashCode(RowData obj)
    {
        if (obj?.Aliases?.Any() != true)
        {
            return int.MinValue;
        }

        unchecked
        {
            // XOR is commutative, so the aliases do not need to be sorted here.
            int hash = 0;
            foreach (string alias in obj.Aliases)
                hash ^= (alias?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

So the code to remove the duplicates is easy:

rowData = rowData.Distinct(new RowDataAliasComparer()).ToList();
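
As a rough illustration of how such a comparer behaves with Distinct, here is a minimal, self-contained sketch. It is not the answer's code: OrdinalAliasComparer and DistinctDemo are hypothetical names, and the sketch assumes RowData exposes public Name and Aliases properties. It sorts with StringComparer.Ordinal so the ordering does not depend on the current culture, and it uses System.HashCode (available in .NET Core 2.1 and later) instead of a hand-written XOR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class RowData
{
    public string Name { get; set; }
    public string[] Aliases { get; set; }
}

// Hypothetical variant of the comparer above: same order-insensitive equality,
// but with ordinal sorting and System.HashCode.
public sealed class OrdinalAliasComparer : IEqualityComparer<RowData>
{
    private static IEnumerable<string> Sorted(string[] aliases) =>
        aliases.OrderBy(a => a, StringComparer.Ordinal);

    public bool Equals(RowData x, RowData y)
    {
        if (x?.Aliases == null || y?.Aliases == null)
        {
            // Equal only when both sides lack an alias array.
            return x?.Aliases == null && y?.Aliases == null;
        }

        return Sorted(x.Aliases).SequenceEqual(Sorted(y.Aliases), StringComparer.Ordinal);
    }

    public int GetHashCode(RowData obj)
    {
        if (obj?.Aliases == null)
        {
            return 0;
        }

        // Hash the aliases in sorted order so reordered arrays get the same hash.
        var hash = new HashCode();
        foreach (string alias in Sorted(obj.Aliases))
        {
            hash.Add(alias, StringComparer.Ordinal);
        }
        return hash.ToHashCode();
    }
}

public static class DistinctDemo
{
    // Returns the number of rows left after de-duplicating by alias set.
    public static int CountDistinctRows()
    {
        var rows = new List<RowData>
        {
            new RowData { Name = "John Doe", Aliases = new[] { "johndoe", "jdoe" } },
            new RowData { Name = "Jane Doe", Aliases = new[] { "janedoe", "jdoe" } },
            // Same aliases as the first row, in a different order.
            new RowData { Name = "John Doe", Aliases = new[] { "jdoe", "johndoe" } },
        };

        return rows.Distinct(new OrdinalAliasComparer()).Count();
    }
}
```

With the sample rows, the third row has the same aliases as the first in a different order, so two rows remain. If the order of aliases should matter, dropping the sorting in both methods would likely be enough.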

Answer 2

Score: 2

Another option is DistinctBy() (available since .NET 6), which might be a bit cleaner. This implementation is order-sensitive. It is easy to make it order-insensitive by sorting both sequences with OrderBy() before comparing them.

public class StringArrayComparer : IEqualityComparer<IEnumerable<string>>
{
    public bool Equals(IEnumerable<string>? x, IEnumerable<string>? y)
    {
        // Enumerable.SequenceEqual throws on null arguments, so handle null first.
        if (x is null || y is null)
        {
            return x is null && y is null;
        }

        return x.SequenceEqual(y);
    }

    public int GetHashCode(IEnumerable<string> obj)
    {
        // copy of GetHashCode() method of Tim Schmelter's answer
        if (obj.Any() != true)
        {
            return int.MinValue;
        }

        unchecked
        {
            int hash = 0;
            foreach (string alias in obj)
                hash ^= (alias?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

Usage:

var filteredRows = rowData.DistinctBy(s => s.Aliases, new StringArrayComparer());
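
The order-insensitive variant mentioned above could look roughly like the following sketch. UnorderedAliasComparer and DistinctByDemo are hypothetical names, the comparer is typed on string[] rather than IEnumerable<string> to keep type inference trivial, and RowData is assumed to expose public Name and Aliases properties. DistinctBy requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class RowData
{
    public string Name { get; set; }
    public string[] Aliases { get; set; }
}

// Hypothetical order-insensitive counterpart of StringArrayComparer.
public sealed class UnorderedAliasComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;

        // Sort both sides ordinally, then compare element by element.
        return x.OrderBy(s => s, StringComparer.Ordinal)
                .SequenceEqual(y.OrderBy(s => s, StringComparer.Ordinal));
    }

    public int GetHashCode(string[] obj)
    {
        if (obj is null) return 0;

        // XOR is commutative, so element order does not affect the hash.
        int hash = 0;
        foreach (string alias in obj)
            hash ^= alias?.GetHashCode() ?? 0;
        return hash;
    }
}

public static class DistinctByDemo
{
    // Keeps one row per alias set and returns the names of the kept rows.
    public static List<string> DistinctNames(IEnumerable<RowData> rows) =>
        rows.DistinctBy(r => r.Aliases, new UnorderedAliasComparer())
            .Select(r => r.Name)
            .ToList();
}
```

DistinctBy is lazily evaluated, so the sketch calls ToList() to materialize the result. One trade-off of the XOR hash is that an alias appearing twice in the same array cancels itself out; that only costs extra hash collisions, since Equals still decides the final answer.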

Answer 3

Score: 0

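You can use LINQ's GroupBy with string.Join to build a comparable key. Note that this key also includes Name, so rows count as duplicates only when both Name and Aliases match:

var _result = rowData
    .GroupBy(d => new { d.Name, v = string.Join("|", d.Aliases) })
    .Select(g => g.First())
    .ToList();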
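
One caveat with a joined string key is that the separator can also occur inside an alias, in which case two different arrays may produce the same key. The sketch below illustrates this and shows one possible way around it, a length-prefixed key. AliasKeys and both method names are hypothetical, and the sketch assumes no alias is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AliasKeys
{
    // Same key shape as the string.Join approach above.
    public static string JoinedKey(string[] aliases) => string.Join("|", aliases);

    // Hypothetical alternative: prefix each alias with its length, so the
    // boundaries between aliases stay unambiguous whatever the aliases contain.
    public static string LengthPrefixedKey(string[] aliases) =>
        string.Concat(aliases.Select(a => $"{a.Length}:{a}"));
}
```

For example, the arrays { "a|b" } and { "a", "b" } both yield the joined key "a|b", whereas their length-prefixed keys are "3:a|b" and "1:a1:b". Whether this matters depends on which characters can appear in real aliases.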

Answer 4

Score: 0

Try this code:

List<RowData> rowData = new List<RowData>();
rowData.Add(new RowData { Name = "John Doe", Aliases = new string[] { "johndoe", "jdoe" } });
rowData.Add(new RowData { Name = "Jane Doe", Aliases = new string[] { "janedoe", "jdoe" } });
rowData.Add(new RowData { Name = "John Doe", Aliases = new string[] { "johndoe", "jdoe" } });

// Note: Aliases is a string[], and anonymous types compare array members by
// reference, so rows whose arrays merely hold equal strings land in separate groups.
List<RowData> distinctData = rowData
    .GroupBy(x => new { x.Name, x.Aliases })
    .Select(group => group.First())
    .ToList();
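
Because anonymous types compare each member with its default equality, and arrays default to reference equality, a key such as new { x.Name, x.Aliases } treats two separately allocated arrays as different even when their contents match. The following sketch contrasts that with grouping through an explicit comparer. AliasSequenceComparer and GroupByDemo are hypothetical names, and RowData is assumed to expose public Name and Aliases properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class RowData
{
    public string Name { get; set; }
    public string[] Aliases { get; set; }
}

// Hypothetical helper: element-by-element (order-sensitive) array equality.
public sealed class AliasSequenceComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(string[] obj)
    {
        if (obj is null) return 0;

        var hash = new HashCode();
        foreach (string alias in obj)
            hash.Add(alias);
        return hash.ToHashCode();
    }
}

public static class GroupByDemo
{
    public static List<RowData> SampleRows() => new List<RowData>
    {
        new RowData { Name = "John Doe", Aliases = new[] { "johndoe", "jdoe" } },
        new RowData { Name = "Jane Doe", Aliases = new[] { "janedoe", "jdoe" } },
        new RowData { Name = "John Doe", Aliases = new[] { "johndoe", "jdoe" } },
    };

    // Anonymous-type key: the array member is compared by reference.
    public static int GroupsWithAnonymousKey() =>
        SampleRows().GroupBy(x => new { x.Name, x.Aliases }).Count();

    // Explicit comparer: arrays with equal content fall into the same group.
    public static int GroupsWithSequenceComparer() =>
        SampleRows().GroupBy(x => x.Aliases, new AliasSequenceComparer()).Count();
}
```

With the sample rows, the anonymous-type key produces three groups, while the comparer-based grouping produces two. Appending .Select(g => g.First()).ToList() to the second query would then return one row per alias sequence.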

Posted by huangapple on 2023-06-26 22:23:44
Link to this article: https://go.coder-hub.com/76557606.html