I have a column of Vietnamese characters that can't be properly read by R when imported as a .csv
Question
After converting the original Excel file to a .csv and importing it into R, characters that look fine in Excel become garbled in R.
When I check the encoding of the column, I get a mix of ASCII, WINDOWS-1252, and MAC-CENTRALEUROPE. I'd like the column either to display Vietnamese characters in R, or to be converted entirely to Latin characters.
I tried using the stringi package to convert the column to a single encoding such as UTF-8, so that I could then use the vietnameseConverter package to convert it to Vietnamese characters, or the Encoding() function to turn it into Latin characters. However, the column still reports multiple different encodings.
Answer 1
Score: 1
When exporting to a CSV file, Excel does not always preserve the original character encoding, which leads to garbled text when the file is imported into R.
- Save the file as a CSV with UTF-8 encoding.
- In R, read the CSV file with the read.csv() function and set the fileEncoding argument to "UTF-8".
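The two steps above can be sketched as follows. This is a minimal illustration, not the asker's actual data: the temporary file, the column name `name`, and the sample values are all made up, and it assumes an R session running in a UTF-8 locale (the default on macOS and Linux, and on Windows from R 4.2).

```r
# Build a tiny UTF-8 CSV in a temporary file so the example is self-contained.
# The column name and the values are hypothetical stand-ins for the real data.
expected <- c("Nguy\u1ec5n", "Tr\u1ea7n", "Hu\u1ebf")
path <- tempfile(fileext = ".csv")
writeLines(enc2utf8(c("name", expected)), path, useBytes = TRUE)

# Declare the file's encoding on import instead of relying on the session default.
df <- read.csv(path, fileEncoding = "UTF-8", stringsAsFactors = FALSE)

# Sanity check: every value should be valid UTF-8 after import.
print(df$name)
print(validUTF8(df$name))
```

If the file was saved with Excel's "CSV UTF-8" option, it may start with a byte-order mark; in that case `fileEncoding = "UTF-8-BOM"` is worth trying so the first column name is not polluted.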
If you still encounter encoding issues, you can try the iconv() function in R to convert the column's character encoding to UTF-8.
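A rough sketch of the iconv() route, assuming the bytes really are Windows-1252 (one of the encodings reported in the question; the true source encoding may differ). The sample value is hypothetical. The optional last step uses stringi's `stri_trans_general()` with the ICU "Latin-ASCII" transform to strip diacritics, and only runs if stringi is installed.

```r
# Bytes for "Thai Binh" with diacritics as stored in a Windows-1252 file
# (0xE1 = a-acute, 0xEC = i-grave). Hypothetical sample value.
raw_name <- rawToChar(as.raw(c(0x54, 0x68, 0xE1, 0x69, 0x20,
                               0x42, 0xEC, 0x6E, 0x68)))

# Convert from the assumed source encoding to UTF-8.
utf8_name <- iconv(raw_name, from = "WINDOWS-1252", to = "UTF-8")
print(utf8_name)

# Optional: reduce to plain Latin letters if stringi is available.
if (requireNamespace("stringi", quietly = TRUE)) {
  ascii_name <- stringi::stri_trans_general(utf8_name, "Latin-ASCII")
  print(ascii_name)
}
```

For a data frame column the same call can be applied to the whole vector, for example `df$name <- iconv(df$name, from = "WINDOWS-1252", to = "UTF-8")`. Note that iconv() applies one source encoding to every value, so it only helps if the column was actually written in that encoding.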