CsvHelper behaves differently with TextDelimiter and FieldDelimiter

Question

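We are using CsvHelper to process csv and dat files. For dat files, we set "TextDelimiter": "þ" and "FieldDelimiter": "¶".

If the dat file does not contain any double quotes ("), CsvHelper works fine.
If the dat file contains a double quote ("), CsvHelper splits the data incorrectly.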

using (var reader = new StreamReader(datFilePath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
	if (LoadFileExtn.ToLower() == ".dat")
	{
		//csv.Configuration.IgnoreQuotes = true;
		csv.Configuration.Delimiter = "¶";
		csv.Configuration.Quote = 'þ'; // Quote is a char, not a string
	}
	else
	{
		csv.Configuration.Delimiter = "|";
	}
	csv.Read();
	csv.ReadHeader();
	List<IDictionary<string, object>> dataRecords = csv.GetRecords<dynamic>()
				   .Select(x => (IDictionary<string, object>)x)
				   .ToList();
	foreach (var record in dataRecords)
	{

	}
}

LoadFile:

þInternal_File_IdþþIdþþFileNameþþEMAIL_FROMþþSubjectþþEMAIL_RECEIVED_DATE_TIMEþ
þ248073þþGRM00001504þþSCS CRUDE STRADDLES 11-19.msgþþAAA <aaa@mail.com>þþTest Mail 1þþ2001-04-17 04:13:00þ
þ248074þþGRM00001505þþPlease provide your NT Login Id and "_pc" Id RE: Vol Smil Authorization.msgþþAAA <aaa@mail.com>þþPlease provide your NT Login Id and "_pc" Id RE: Vol Smil Authorizationþþ2001-04-17 04:13:00þ

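Answer 1

Score: 1

I believe this happens because the double quote is also the default escape character. I changed the escape character to match your quote character, and it appears to work correctly for me.

It also looks like you are using an older version of CsvHelper. The example I'm providing targets the newer version, but you should be able to add csv.Configuration.Escape = 'þ'; to your code.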

var config = new CsvConfiguration(CultureInfo.InvariantCulture){
	Delimiter = "¶",   // Delimiter is a string
	Quote = 'þ',       // Quote and Escape are chars
	Escape = 'þ'
};

using (var reader = new StreamReader(@"C:\Users\dspecht\Downloads\SampleDatFile_20230417_2.dat"))
using (var csv = new CsvReader(reader, config))
{
	csv.Read();
	csv.ReadHeader();
	List<IDictionary<string, object>> dataRecords = csv.GetRecords<dynamic>()
				   .Select(x => (IDictionary<string, object>)x)
				   .ToList();
	foreach (var record in dataRecords)
	{
		Console.WriteLine(record);
	}
}
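
For the older API used in the question, the same change might look like the sketch below. It assumes an older CsvHelper release in which the settings on csv.Configuration are still writable, and it reads a small hypothetical in-memory sample (not the real dat file) in which "¶" is written explicitly between the fields. The Program class and the sample contents are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;

class Program
{
    static void Main()
    {
        // Hypothetical sample data: fields wrapped in "þ" and separated by "¶",
        // with literal double quotes inside the Subject value.
        var sample =
            "þIdþ¶þSubjectþ\n" +
            "þGRM00001505þ¶þPlease provide your NT Login Id and \"_pc\" Idþ\n";

        using (var reader = new StringReader(sample))
        using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
        {
            // Assumption: older CsvHelper version with a mutable csv.Configuration.
            csv.Configuration.Delimiter = "¶";
            csv.Configuration.Quote = 'þ';
            // Match the escape character to the quote character so that a
            // double quote inside a field is treated as ordinary text.
            csv.Configuration.Escape = 'þ';

            csv.Read();
            csv.ReadHeader();
            List<IDictionary<string, object>> dataRecords = csv.GetRecords<dynamic>()
                .Select(x => (IDictionary<string, object>)x)
                .ToList();

            foreach (var record in dataRecords)
            {
                // Print each column name with its value.
                Console.WriteLine(string.Join(" | ",
                    record.Select(kv => kv.Key + "=" + kv.Value)));
            }
        }
    }
}
```

In the question's own code, only the csv.Configuration.Escape line would need to be added inside the ".dat" branch; the rest of this sketch exists just to make it runnable on its own.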

huangapple
  • Posted on April 17, 2023, 15:19:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76032565.html