如何在C#中合并两个具有不同列和行的CSV文件?

huangapple go评论77阅读模式
英文:

how to merge two csv files with different columns and rows in c#

问题

col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa
,,,bb,vv
,,,m,n
英文:

I'm trying to merge two csv files which have different headers and different number of rows/lines.
Using the following code, but doesn't get correct output. It's working when rows are same.

var first = File.ReadAllLines("firstfile.csv");
var second = File.ReadAllLines("secondfile.csv");
var result = first.Zip(second, (f, s) => string.Join(",", f, s));
File.WriteAllLines("combined.csv", result);

for ex:
firstfile is

col1,colb,colc
a,b,c
a,v,f

the secondfile is

colx,coly
x,y
cc,aa
bb,vv
m,n

the output is get

col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa

the second file rows are missiing.
my expected output is

col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa
,,,bb,vv
,,,m,n

答案1

得分: 2

以下是您要的中文翻译部分:

没有内置方法允许您合并两个不等长的列表。Zip仅合并到最短长度。但是,您可以通过修改Marc Gravell的优秀答案这里来实现您想要的效果,以允许默认值。创建一个扩展类,类似于这样:

public static class Extensions
{
    public static IEnumerable<T> Merge<T>(this IEnumerable<T> first,
    IEnumerable<T> second, T defaultValue, Func<T, T, T> operation)
    {
        using (var iter1 = first.GetEnumerator())
        using (var iter2 = second.GetEnumerator())
        {
            while (iter1.MoveNext())
            {
                if (iter2.MoveNext())
                {
                    yield return operation(iter1.Current, iter2.Current);
                }
                else
                {
                    yield return operation(iter1.Current, defaultValue);
                }
            }
            while (iter2.MoveNext())
            {
                yield return operation(defaultValue, iter2.Current);
            }
        }
    }
}

现在您可以使用以下代码调用它:

char separator = ',';
var first = File.ReadAllLines("firstfile.csv").AsEnumerable();
var second = File.ReadAllLines("secondfile.csv").AsEnumerable();

string defaultValue = "";
int cnt = 0;
if (first.Count() < second.Count())
{
    cnt = first.FirstOrDefault().Split(separator).Length;
}
else
{
    cnt = second.FirstOrDefault().Split(separator).Length;
}
defaultValue = defaultValue.PadLeft(cnt - 1, separator);
var result = first.Merge(second, defaultValue, (f, s) => string.Join(separator.ToString(), f, s));
File.WriteAllLines("combined.csv", result);

请注意,我添加了一个char分隔符,并将ReadAllLines的结果更改为IEnumerable<string>,而不是string[],以使代码更通用。此外,上述代码假定两个文件都具有内部一致的列数。

英文:

There is no inbuilt method that allows you to merge two lists of unequal length. Zip only merges down to the shortest length. However, you can achieve what you want by modifying Marc Gravell's excellent answer here, in order to allow a default value. Create yourself an extensions class, something like this:

public static class Extensions
{
    public static IEnumerable&lt;T&gt; Merge&lt;T&gt;(this IEnumerable&lt;T&gt; first,
    IEnumerable&lt;T&gt; second, T defaultValue, Func&lt;T, T, T&gt; operation)
    {
        using (var iter1 = first.GetEnumerator())
        using (var iter2 = second.GetEnumerator())
        {
            while (iter1.MoveNext())
            {
                if (iter2.MoveNext())
                {
                    yield return operation(iter1.Current, iter2.Current);
                }
                else
                {
                    yield return operation(iter1.Current, defaultValue);
                }
            }
            while (iter2.MoveNext())
            {
                yield return operation(defaultValue, iter2.Current);
            }
        }
    }
}

You can now call it with code like this:

char separator = &#39;,&#39;;
var first = File.ReadAllLines(&quot;firstfile.csv&quot;).AsEnumerable();
var second = File.ReadAllLines(&quot;secondfile.csv&quot;).AsEnumerable();
            
string defaultValue = &quot;&quot;;
int cnt = 0;
if (first.Count() &lt; second.Count())
{
    cnt = first.FirstOrDefault().Split(separator).Length;
}
else
{
    cnt = second.FirstOrDefault().Split(separator).Length;
}
defaultValue = defaultValue.PadLeft(cnt - 1, separator);
var result = first.Merge(second, defaultValue, (f, s) =&gt; string.Join(separator.ToString(), f, s));
File.WriteAllLines(&quot;combined.csv&quot;, result);

Note I have added a char separator and changed the result of ReadAllLines to give an IEnumerable&lt;string&gt; rather than string[] to make the code more generic. Also the above code assumes that the both files have an internally consistent number of columns.

答案2

得分: 0

First you need to find out which of the two lists is the larger one so you can loop over that one and once you're past the length of the smaller list you can fill up the missing cells with empty values.

首先,你需要确定两个列表中哪个更大,这样你就可以循环遍历较大的列表,一旦超过较小列表的长度,就可以用空值填充缺失的单元格。

Next you need to know how many columns you have in the smaller list as you want to fill these columns with empty values. That means you have to take the header line of the smaller list, split it by comma and count the columns.

接下来,你需要知道较小列表中有多少列,因为你想要用空值填充这些列。这意味着你需要获取较小列表的标题行,按逗号分割它并计算列数。

Then generate a string containing your empty cells (eg. if your smaller list has 3 columns, you need a string ",,," - String Padding may be of help here).

然后生成一个包含空单元格的字符串(例如,如果你的较小列表有3列,你需要一个字符串",,," - 字符串填充可能会有所帮助)。

So then you only have to loop over the larger list and get the two corresponding rows (or use the empty one you generated earlier) and concatenate them with a comma and put them in a list.

然后,你只需要循环遍历较大的列表,获取两个相应的行(或使用之前生成的空行),并用逗号连接它们并放入一个列表中。

英文:

First you need to find out which of the two lists is the larger one so you can loop over that one and once you're past the length of the smaller list you can fill up the missing cells with empty values.

Next you need to know how many columns you have in the smaller list as you want to fill these columns with empty values. That means you have to take the header line of the smaller list, split it by comma and count the columns.

Then generate a string containing your empty cells (eg. if your smaller list has 3 columns, you need a string ",," - String Padding may be of help here).

So then you only have to loop over the larger list and get the two corresponding rows (or use the empty one you generated earlier) and concatenate them with a comma and put them in a list.

huangapple
  • 本文由 发表于 2020年1月3日 17:10:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/59575766.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定