Converting UTF-8 data to UTF-16 with BOM

Question

I have a variable called csv that contains all the data I need. I want to convert this data to UTF-16LE so that Excel recognizes it when the file is simply opened, without needing the Get Data button.

I tried using this solution, and while LibreOffice Calc does recognize the result as a UTF-16 file, Excel does not decode the Greek characters correctly.

Here is the JavaScript code I have:

    csv = csvRows.join('\r\n')

    var byteArray = new Uint8Array(csv.length * 2);
    for (var i = 0; i < csv.length; i++) {
      byteArray[i * 2] = csv.charCodeAt(i) // & 0xff;
      byteArray[i * 2 + 1] = csv.charCodeAt(i) >> 8 // & 0xff;
    }

    var blob = new Blob([byteArray], { type: 'text/csv', encoding: "UTF-16LE" });
    var url = URL.createObjectURL(blob);
    sendFileToClient(url, this.el.id + ".csv")

I solved a similar problem in Elixir by using the :unicode module, like so:

    csv =
      :unicode.characters_to_binary(
        :unicode.encoding_to_bom(:utf8) <> build_csv(data),
        :utf8,
        {:utf16, :little}
      )

I also tried adding a \uFEFF BOM character at the beginning of the file, but then the file is recognized as UTF-8 with BOM (according to Notepad++). When I tried \uFFFE as the BOM instead, the character was turned into different letters.

LibreOffice Calc opening the file without BOM

LibreOffice Calc opening the file with \ufeff BOM

Excel opening the file without BOM

Excel opening the file with BOM
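
On the two BOM values mentioned in the question, a small sketch may help show what bytes each one turns into. It assumes a runtime with the standard `TextEncoder` (browsers and recent Node.js); `codeUnitToLeBytes` is a hypothetical helper introduced only for this illustration. It suggests, without confirming what happened in the asker's setup, that U+FEFF written as UTF-8 gives the three-byte UTF-8 signature, while U+FFFE written in little-endian order gives the byte pair that marks big-endian UTF-16.

```javascript
// U+FEFF is the byte order mark code point; its bytes depend on the encoding.
// TextEncoder always produces UTF-8.
const bomUtf8 = Array.from(new TextEncoder().encode('\uFEFF'));

// Hypothetical helper: serialize one UTF-16 code unit in little-endian order.
function codeUnitToLeBytes(unit) {
  return [unit & 0xff, unit >> 8];
}

// U+FEFF in UTF-16LE: low byte first, so FF then FE.
const bomUtf16Le = codeUnitToLeBytes(0xfeff);

// U+FFFE written the same way: FE then FF, which is the UTF-16BE signature.
const swapped = codeUnitToLeBytes(0xfffe);

const toHex = (bytes) => bytes.map((b) => b.toString(16).toUpperCase()).join(' ');
console.log(toHex(bomUtf8), '|', toHex(bomUtf16Le), '|', toHex(swapped));
```

If a file starts with FE FF but the rest of the data is little-endian, a reader that trusts the signature would likely swap every byte pair, which might explain the "different letters".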

Answer 1

Score: 0

I solved the problem by adding the UTF-8 BOM before converting the content to UTF-16.
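
A minimal sketch of how this might look in JavaScript, assuming the BOM is prepended to the string as the character U+FEFF before the code units are written out in little-endian order. The helper name `encodeUtf16LeWithBom` and the sample input are hypothetical; the answer does not show its own code, so this is one reading of it and not the author's implementation.

```javascript
// Hypothetical helper: encode a JavaScript string as UTF-16LE bytes,
// preceded by a byte order mark.
function encodeUtf16LeWithBom(text) {
  // Prepend U+FEFF; written little-endian it becomes the bytes FF FE.
  const withBom = '\uFEFF' + text;
  const bytes = new Uint8Array(withBom.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < withBom.length; i++) {
    // charCodeAt returns one UTF-16 code unit; `true` selects little-endian.
    view.setUint16(i * 2, withBom.charCodeAt(i), true);
  }
  return bytes;
}

// Sample input with Greek letters, standing in for the real CSV string.
const bytes = encodeUtf16LeWithBom('α,β\r\n');
console.log(bytes.length, bytes[0].toString(16), bytes[1].toString(16));
```

The resulting array could then be passed to `new Blob([bytes], { type: 'text/csv' })` as in the question. The options the `Blob` constructor defines are `type` and `endings`, so the `encoding` property in the question's code presumably has no effect. Whether Excel then picks up the encoding and the delimiter may still depend on the Excel version and locale.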

huangapple
  • Posted on July 6, 2023 at 19:51:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76628537.html