Question

For our current project, I am using the CsvHelper NuGet package, and everything works perfectly except when a field contains special characters (ä, ü, ...). How can I change it so that these characters are read correctly instead of being replaced by `?`? (I tried the current and invariant cultures, but it made no difference.)

I tried changing the Culture when reading the byte stream from the file and I tried using different Cultures when parsing the CSV.

Answer 1

Score: 1

I often run into this when someone saves an Excel file as CSV (Comma delimited) (*.csv) rather than as CSV UTF-8 (Comma delimited) (*.csv). Depending on the country it is saved in, this often means the file was saved with Windows-1252 encoding. The culture passed to `CsvReader` does not help here, because the bytes are decoded by the `StreamReader` before CsvHelper ever sees the text. In most cases, you can get away with using ISO-8859-1 (also known as Latin-1) when reading the file with `StreamReader`. If some characters are still not read correctly, you may have to use the exact encoding that was used to save the file.

> ISO-8859-1 (also called Latin-1) is identical to Windows-1252 (also called CP1252) except for the code points 128-159 (0x80-0x9F). ISO-8859-1 assigns several control codes in this range. Windows-1252 has several characters, punctuation, arithmetic and business symbols assigned to these code points. https://www.i18nqa.com/debug/table-iso8859-1-vs-windows-1252.html
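
To make the difference concrete, here is a minimal sketch that decodes the same raw bytes with both encodings and with UTF-8. It is not from the original answer: the class and method names are made up for illustration, and it assumes a runtime where `CodePagesEncodingProvider` is available (it ships with .NET Core 3.0+ and .NET 5+; older targets need the System.Text.Encoding.CodePages package).

```csharp
using System;
using System.Text;

public static class EncodingComparison
{
    // Decodes the given raw bytes with the given encoding.
    public static string Decode(Encoding encoding, byte[] bytes) => encoding.GetString(bytes);

    // Windows-1252 is not among the default encodings on .NET Core,
    // so the code pages provider has to be registered first.
    public static Encoding GetWindows1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Encoding win1252 = GetWindows1252();

        byte[] umlauts = { 0xE4, 0xFC }; // ä and ü in both encodings
        byte[] euro = { 0x80 };          // inside the 0x80-0x9F range where they differ

        // Outside 0x80-0x9F both encodings agree.
        Console.WriteLine(Decode(latin1, umlauts) == Decode(win1252, umlauts)); // True

        // Inside that range Windows-1252 has printable characters,
        // while Latin-1 has control codes.
        Console.WriteLine(Decode(win1252, euro) == "\u20AC"); // True (euro sign)
        Console.WriteLine(Decode(latin1, euro) == "\u0080");  // True (control code)

        // The same bytes are not valid UTF-8, so the decoder substitutes
        // U+FFFD, which is typically displayed as "?".
        Console.WriteLine(Decode(Encoding.UTF8, umlauts).Contains('\uFFFD')); // True
    }
}
```

So for plain umlauts Latin-1 and Windows-1252 give the same result, and the exact encoding only matters for characters such as the euro sign or curly quotes.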

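On .NET Core (and .NET 5+), only a limited set of encodings is available by default. Other code pages, such as Windows-1252, become available once you register `CodePagesEncodingProvider.Instance` with `Encoding.RegisterProvider`.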

> The example produces the following output when run on .NET Core:

Info.CodePage	Info.Name	Info.DisplayName
1200	utf-16	Unicode
1201	utf-16BE	Unicode (Big-Endian)
12000	utf-32	Unicode (UTF-32)
12001	utf-32BE	Unicode (UTF-32 Big-Endian)
20127	us-ascii	US-ASCII
28591	iso-8859-1	Western European (ISO)
65000	utf-7	Unicode (UTF-7)
65001	utf-8	Unicode (UTF-8)

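using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
using CsvHelper;

void Main()
{
	// Decode the file as Latin-1 (ISO-8859-1) instead of the default UTF-8.
	// Encoding.Latin1 requires .NET 5 or later; on older runtimes use
	// Encoding.GetEncoding("iso-8859-1").
	using var reader = new StreamReader(@"C:\Users\myName\Documents\TestUmlauts.csv", 
		Encoding.Latin1);
	using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
	
	// GetRecords is lazy; ToList() reads the rows while the reader is still open.
	var records = csv.GetRecords<Foo>().ToList();
}

// One CSV row; property names are matched against the header names.
public class Foo 
{
	public int Id { get; set; }
	public string Name { get; set; }
}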
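
If Latin-1 still garbles a few characters, the file was probably saved as real Windows-1252. The following is only a sketch of reading with that exact encoding, not part of the original answer. `Windows1252Reader` and `ReadAllText` are hypothetical names, and the sample bytes stand in for a file so the example runs without CsvHelper. In real code the `StreamReader` would be handed to `CsvReader` as shown above.

```csharp
using System;
using System.IO;
using System.Text;

public static class Windows1252Reader
{
    // Reads the whole stream as text, decoding the bytes as Windows-1252.
    public static string ReadAllText(Stream stream)
    {
        // Assumption: CodePagesEncodingProvider is available (.NET Core 3.0+ / .NET 5+,
        // or the System.Text.Encoding.CodePages package on older targets).
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding win1252 = Encoding.GetEncoding(1252);

        using var reader = new StreamReader(stream, win1252);
        return reader.ReadToEnd();
    }

    public static void Main()
    {
        // "Müller €" as Windows-1252 bytes: ü is 0xFC, the euro sign is 0x80.
        byte[] bytes =
        {
            (byte)'M', 0xFC, (byte)'l', (byte)'l', (byte)'e', (byte)'r', (byte)' ', 0x80
        };

        string text = ReadAllText(new MemoryStream(bytes));
        Console.WriteLine(text == "M\u00FCller \u20AC"); // True
    }
}
```

The comparison is printed as a boolean rather than the text itself, because the console's own output encoding could otherwise hide whether the decoding worked.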
