Get file encoding (ASCII or EBCDIC) with Java
Question
I have a file whose extension is .b3c, and I want to know whether it is encoded in ASCII or EBCDIC. How can I achieve that in Java?
Any help is appreciated.
Thanks
Answer 1
Score: 2
Assuming the text file contains multiple lines of text, check for the newline character.
In ASCII, lines end with an LF / \n / 0x0a. Sure, on Windows there's also a CR, but we can ignore that part.
In EBCDIC, lines end with an NL / \025 / 0x15.
ASCII text files will not contain a 0x15 / NAK, and EBCDIC text files will not contain a 0x0a / SMM, so look for both:
- If only one of them is found, you know the character set.
- If both are found, the file is a binary file, not a text file, so reject it.
- If neither is found, the file could have just one line of text, in which case further analysis might be needed. Hopefully that won't be the case here, so the simple test above should be enough.
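The check described above could be sketched in Java roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name EncodingGuess, the Guess enum, and the guess method are made-up names for this example, and it assumes the file is small enough to read into memory in one go.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class; all names here are illustrative only.
class EncodingGuess {

    // Possible outcomes of the newline-byte check.
    enum Guess { ASCII, EBCDIC, BINARY, UNKNOWN }

    // Scans the bytes for the ASCII LF (0x0a) and the EBCDIC NL (0x15).
    static Guess guess(byte[] data) {
        boolean sawAsciiLf = false;   // 0x0a seen
        boolean sawEbcdicNl = false;  // 0x15 seen
        for (byte b : data) {
            if (b == 0x0A) {
                sawAsciiLf = true;
            } else if (b == 0x15) {
                sawEbcdicNl = true;
            }
        }
        if (sawAsciiLf && sawEbcdicNl) {
            return Guess.BINARY;      // both found: treat as not a text file
        }
        if (sawAsciiLf) {
            return Guess.ASCII;
        }
        if (sawEbcdicNl) {
            return Guess.EBCDIC;
        }
        return Guess.UNKNOWN;         // neither found: needs further analysis
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EncodingGuess <file>");
            return;
        }
        // Assumes the whole file fits in memory.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(guess(data));
    }
}
```

Running it with the path of the .b3c file as the only argument prints one of the four enum names. For very large files, reading through an InputStream and stopping once both bytes have been seen would avoid loading everything at once.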