Error when sending special characters using OutputStream when the charset is UTF-8

Question

I am creating a simple service using HttpServer. The service works correctly when I use strings without special characters.

public static void main(String[] args) throws Exception {
    HttpServer server = HttpServer.create(new InetSocketAddress(serverPort), 0);
    server.createContext("/notification", new MyHandler());
    server.setExecutor(null); // creates a default executor
    server.start();
}

static class MyHandler implements HttpHandler {
    public void handle(HttpExchange t) throws IOException {
        String response;

        response = "with special characters éáã "; // it doesn't work
        response = "without special characters"; // it works!

        String encoding = "UTF-8";

        System.out.println(response);

        t.getResponseHeaders().set("Content-Type", "application/json; charset=" + encoding);

        t.sendResponseHeaders(200, response.length());
        byte[] bytes = response.getBytes(StandardCharsets.UTF_8);
        OutputStream os = t.getResponseBody();
        os.write(bytes);
        os.flush();

        os.close();
    }
}

When my UTF-8 string has special characters, it returns this error:

java.io.IOException: too many bytes to write to stream
	at sun.net.httpserver.FixedLengthOutputStream.write(FixedLengthOutputStream.java:76)
	at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
	at sun.net.httpserver.PlaceholderOutputStream.write(ExchangeImpl.java:439)
	at InternalServer$MyHandler.handle(InternalServer.java:86)
	at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
	at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
	at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
	at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
	at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
	at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
	at sun.net.httpserver.ServerImpl$DefaultExecutor.execute(ServerImpl.java:158)
	at sun.net.httpserver.ServerImpl$Dispatcher.handle(ServerImpl.java:431)
	at sun.net.httpserver.ServerImpl$Dispatcher.run(ServerImpl.java:396)
	at java.lang.Thread.run(Thread.java:748)

Answer 1

Score: 2

Here is the problem.

 t.sendResponseHeaders(200, response.length());

The second parameter to sendResponseHeaders must be the exact size, in bytes, of the content you are going to send. But you are passing response.length(), which is the length of the string in characters (UTF-16 code units), not in bytes.

In UTF-8, any code point at or above U+0080 is encoded as 2 or more bytes. Your example string with special characters contains é, á and ã, which are all above U+0080, so its UTF-8 byte count is larger than its character count. As a result, the content length you declare in the response headers is too small.
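To make the mismatch concrete, here is a minimal sketch that compares String.length() with the UTF-8 byte count for the two strings from the question. The class name Utf8LengthDemo and the helper utf8ByteCount are made up for illustration, and the accented letters are written as \u escapes only so the source file stays ASCII.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    // Hypothetical helper: number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String plain = "without special characters";
        // \u00e9 = e-acute, \u00e1 = a-acute, \u00e3 = a-tilde
        String accented = "with special characters \u00e9\u00e1\u00e3 ";

        // For ASCII-only text the two numbers match; for the accented text they do not.
        System.out.println(plain.length() + " chars, " + utf8ByteCount(plain) + " bytes");
        System.out.println(accented.length() + " chars, " + utf8ByteCount(accented) + " bytes");
    }
}
```

Each of the three accented letters takes two bytes in UTF-8, so the second string needs three more bytes than response.length() reports, which is what triggers the exception.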

As the stack trace shows, the FixedLengthOutputStream that HttpExchange hands you rejects any write that would exceed the length declared in the response headers. (Sending more bytes than declared would be an HTTP protocol violation.)

Solution:

 ...
 byte[] bytes = response.getBytes(StandardCharsets.UTF_8);
 t.sendResponseHeaders(200, bytes.length);
 ...
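For context, the fix could look like this in a complete handler. This is only a sketch under a few assumptions: the class name Utf8ResponseDemo, the BODY constant and the fetchOnce helper are invented for the example, the server binds to an ephemeral port instead of the question's serverPort, and the content type is text/plain because the sample body is not JSON.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class Utf8ResponseDemo {
    // Sample body from the question; \u escapes keep the source file ASCII-only.
    static final String BODY = "with special characters \u00e9\u00e1\u00e3 ";

    static class MyHandler implements HttpHandler {
        @Override
        public void handle(HttpExchange t) throws IOException {
            // Encode first, so the declared length is a byte count, not a char count.
            byte[] bytes = BODY.getBytes(StandardCharsets.UTF_8);
            t.getResponseHeaders().set("Content-Type", "text/plain; charset=UTF-8");
            t.sendResponseHeaders(200, bytes.length);
            // try-with-resources closes the stream even if write() throws.
            try (OutputStream os = t.getResponseBody()) {
                os.write(bytes);
            }
        }
    }

    // Hypothetical helper: starts the server, fetches the body once, then stops the server.
    static String fetchOnce() {
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress(0), 0); // port 0 = any free port
            server.createContext("/notification", new MyHandler());
            server.start();
            int port = server.getAddress().getPort();
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://localhost:" + port + "/notification").toURL().openConnection();
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[1024];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        // Prints whether the client received exactly the string the handler sent.
        System.out.println(BODY.equals(fetchOnce()));
    }
}
```

The only change that matters for the original error is calling getBytes before sendResponseHeaders and passing bytes.length; the client code is there just to exercise the handler.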

It is also possible to pass 0 as the content length. The body is then sent using chunked transfer encoding, so its size does not need to be known up front. (Pass -1 only when there is no response body at all.)
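A sketch of the chunked variant, assuming an HTTP/1.1 client: the handler passes 0 and writes the text through an OutputStreamWriter, so no byte count is ever computed. The names ChunkedResponseDemo, PARTS and fetchOnce are hypothetical, and the ephemeral port and text/plain content type are choices made for the example.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class ChunkedResponseDemo {
    // The body is written in pieces, as if its total size were not known in advance.
    static final String[] PARTS = {"with special characters ", "\u00e9\u00e1\u00e3 "};

    // Hypothetical helper: returns {Transfer-Encoding header value, decoded body}.
    static String[] fetchOnce() {
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress(0), 0); // port 0 = any free port
            server.createContext("/notification", t -> {
                t.getResponseHeaders().set("Content-Type", "text/plain; charset=UTF-8");
                t.sendResponseHeaders(200, 0); // 0 = length not declared, body is chunked
                try (Writer w = new OutputStreamWriter(t.getResponseBody(), StandardCharsets.UTF_8)) {
                    for (String part : PARTS) {
                        w.write(part); // the writer does the UTF-8 encoding
                    }
                }
            });
            server.start();
            int port = server.getAddress().getPort();
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://localhost:" + port + "/notification").toURL().openConnection();
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[1024];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String body = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                return new String[] {conn.getHeaderField("Transfer-Encoding"), body};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        String[] result = fetchOnce();
        System.out.println("Transfer-Encoding: " + result[0]);
        System.out.println("Body intact: " + (PARTS[0] + PARTS[1]).equals(result[1]));
    }
}
```

This trades a slightly larger response (chunk framing) for not having to buffer the whole body, which is mostly useful when the response is generated incrementally. For a short string like the one in the question, declaring bytes.length is the simpler choice.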

huangapple
  • Posted on 2020-10-26 20:13:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64536915.html