Internal working of byte streams: what is the difference between write(65) and write('A')?

Question

Both of these lines write the letter A to a file. Can someone explain how they differ internally?

FileOutputStream fileOutputStream = new FileOutputStream("test.txt");
fileOutputStream.write(65);
fileOutputStream.write('A');

[EDIT]: I am more interested in how the conversion works in both of these cases, since I already know what the ASCII and Unicode tables are.

Answer 1

Score: 6

Let's start with the javadoc for FileOutputStream. If you look at it you will see that there are three write methods:

> - void write(byte[] b) - Writes b.length bytes from the specified byte array to this file output stream.
> - void write(byte[] b, int off, int len) - Writes len bytes from the specified byte array starting at offset off to this file output stream.
> - void write(int b) - Writes the specified byte to this file output stream.
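As a rough illustration of the three overloads listed above, here is a minimal sketch. It assumes a ByteArrayOutputStream as an in-memory stand-in for the FileOutputStream (both follow the same OutputStream contract); the class name WriteOverloadsDemo and the method demo are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; an in-memory stream stands in for the file stream.
final class WriteOverloadsDemo {
    static byte[] demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = sink;
        byte[] data = "ABCDE".getBytes(StandardCharsets.US_ASCII);
        try {
            out.write(data);       // write(byte[]): all five bytes
            out.write(data, 1, 3); // write(byte[], int, int): the bytes for "BCD"
            out.write(65);         // write(int): the single byte 0x41
            out.write('A');        // also write(int), after char -> int widening
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Prints ABCDEBCDAA
        System.out.println(new String(demo(), StandardCharsets.US_ASCII));
    }
}
```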

So what does it tell us?

  1. Clearly, when we call fileOutputStream.write(65) or fileOutputStream.write('A'), we are NOT calling the first or second overloads of the write method. We are actually calling the third one.

  2. Therefore, when we call fileOutputStream.write('A'), the char value 'A' is converted to an int value. This is an implicit primitive widening conversion from char to int. It is equivalent to an explicit type cast, i.e. (int) 'A'.

  3. A primitive widening conversion from an integer type (e.g. char) to a larger integer type (e.g. int) is simply a matter of making it bigger. Because char is an unsigned 16-bit type, widening it to a 32-bit int zero-extends it: 16 zero bits are added on the front and the numeric value is unchanged.

  4. When we look at the ASCII and Unicode code tables, we see that they both use the same value for the capital letter A. It is 65 in decimal or 41 in hexadecimal. In other words, (int) 'A' and 65 are the same value.

  5. So once you take the implicit widening conversion into account, fileOutputStream.write('A') and fileOutputStream.write(65) are actually calling write with the same parameter value.

  6. Finally, the javadoc for write(int) refers to the javadoc for OutputStream.write(int), which says:

    > Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.

    That explains how the int magically turns into a byte.
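The steps above can be checked with a small sketch. It assumes a ByteArrayOutputStream in place of the FileOutputStream so the written bytes can be inspected directly; the class name WriteIntDemo and the helper writeAll are hypothetical, and the value 0x141 is chosen only because it shares its low 8 bits with 0x41.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical demo class; an in-memory stream stands in for the file stream.
final class WriteIntDemo {
    // Passes each value to OutputStream.write(int) and returns the bytes written.
    static byte[] writeAll(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            out.write(v); // only the eight low-order bits of v are kept
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int widened = 'A'; // implicit char -> int widening, same as (int) 'A'
        System.out.println(widened == 65);                   // true
        System.out.println(Integer.toBinaryString(widened)); // 1000001
        // 0x141 differs from 0x41 only above bit 7, so it comes out as the same byte.
        System.out.println(Arrays.toString(writeAll(65, 'A', 0x141))); // [65, 65, 65]
    }
}
```

All three arguments produce the byte 65, which is why write(65) and write('A') leave identical content in the file.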


Note that this only writes something sensible because we picked a character from the ASCII character set. Unicode mirrors the ASCII character set in its first 128 code points, so for those characters the char value and the ASCII byte value coincide.

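If you replaced A with a character outside the ASCII charset, OutputStream::write(int) would most likely garble it: any char above 0xFF loses its high-order bits, and a char in the range 0x80 to 0xFF is written as a single raw byte, which reads back correctly only in an encoding such as ISO-8859-1 that maps those bytes to the same code points.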

It is better to use a FileWriter rather than a FileOutputStream when outputting text, because a Writer encodes characters into bytes using a charset instead of truncating them.
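To make the difference concrete, here is a minimal sketch that writes the euro sign (U+20AC) both ways. It is built on assumptions: an in-memory ByteArrayOutputStream replaces the file, and an OutputStreamWriter with an explicit UTF-8 charset stands in for FileWriter (which is a subclass of OutputStreamWriter). The class and method names are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing a byte stream with a character stream.
final class NonAsciiDemo {
    // Byte stream: the char widens to int and write(int) keeps the low 8 bits.
    static byte[] viaByteStream(char c) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(c);
        return out.toByteArray();
    }

    // Character stream: the Writer encodes the char with the given charset.
    static byte[] viaWriter(char c) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        char euro = '\u20AC';
        System.out.println(hex(viaByteStream(euro))); // AC
        System.out.println(hex(viaWriter(euro)));     // E2 82 AC
    }
}
```

The byte stream keeps only 0xAC, the low byte of 0x20AC, while the writer emits the full three-byte UTF-8 encoding.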

huangapple
  • Posted on 2020-09-05 22:01:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63754737.html