Connection issues through AWS NAT Gateway
Question
I have an Amazon Linux 2 application server running a Spring Boot application in a private subnet.
A NAT gateway in the public subnet sits in front of that application server.
The application sends a request with the Connection: keep-alive header to the remote host, and the remote host responds with the same header.
As a result, I can see an established connection with netstat:
netstat -t | grep <remote server ip>
tcp6 0 0 ip-172-30-4-31.eu:57324 <remote server ip>:http ESTABLISHED
According to the AWS documentation, the NAT gateway closes a connection that has been idle for 350 seconds: https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html#nat-gateway-troubleshooting-timeout
However, the connection is still in the ESTABLISHED state on the application server, so the next request to the remote server fails with:
java.net.SocketException: Connection reset
I tried changing sysctl.conf on the application server so that it closes the connection at almost the same time as the NAT gateway:
net.ipv4.tcp_keepalive_time=351
net.ipv4.tcp_keepalive_intvl=30
net.ipv4.tcp_keepalive_probes=2
But nothing changes: capturing the traffic from the application server to the remote server with tcpdump shows no keep-alive packets.
What can I do to avoid this problem, other than removing the Connection header in my application?
Answer 1
Score: 4
The problem was caused by the way the socket was opened.
I was using the Apache HttpClient Fluent API:
Request.Post(mainProperties.getPartnerURL())
        .addHeader("Signature", SecurityHelper.getSignature(requestBody.getBytes(UTF_8),
                mainProperties.getPartnerKey()))
        .addHeader("Content-Type", "application/x-www-form-urlencoded")
        .connectTimeout(mainProperties.getRequestTimeoutMillis())
        .bodyByteArray(requestBody.getBytes(UTF_8))
        .execute().returnContent().asString();
However, I needed to set the SO_KEEPALIVE option on the socket. This can be done with HttpClient:
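// Imports (Apache HttpClient 4.x):
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.config.SocketConfig;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import static java.nio.charset.StandardCharsets.UTF_8;
// Enable SO_KEEPALIVE on every socket this client opens.
SocketConfig socketConfig = SocketConfig.custom()
        .setSoKeepAlive(true)
        .build();
RequestConfig requestConfig = RequestConfig.custom()
        .setConnectTimeout(mainProperties.getRequestTimeoutMillis())
        .build();
CloseableHttpClient httpClient = HttpClientBuilder.create()
        .setDefaultSocketConfig(socketConfig)
        .setDefaultRequestConfig(requestConfig)
        .build();
HttpPost post = new HttpPost(mainProperties.getPartnerURL());
post.addHeader("Signature", SecurityHelper.getSignature(requestBody.getBytes(UTF_8),
        mainProperties.getPartnerKey()));
post.addHeader("Content-Type", "text/xml");
post.setEntity(new StringEntity(requestBody, UTF_8));
// Close the response after reading the body so the connection is released.
try (CloseableHttpResponse response = httpClient.execute(post)) {
    return EntityUtils.toString(response.getEntity(), UTF_8);
}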
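To illustrate why the kernel keep-alive settings had no effect before, here is a minimal sketch using only the JDK (not HttpClient). It shows that a plain java.net.Socket has SO_KEEPALIVE turned off until the application enables it, and the net.ipv4.tcp_keepalive_* values only apply to sockets with that option on. The class name KeepAliveDemo and the loopback listener standing in for the remote host are assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveDemo {

    /**
     * Opens a loopback connection and returns two flags:
     * [0] SO_KEEPALIVE as the socket was created, [1] SO_KEEPALIVE after setKeepAlive(true).
     */
    static boolean[] keepAliveBeforeAndAfter() throws IOException {
        // Port 0 lets the OS pick a free port; the listener stands in for the remote host.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            boolean before = client.getKeepAlive();
            client.setKeepAlive(true); // the per-socket switch that the sysctl values depend on
            boolean after = client.getKeepAlive();
            return new boolean[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = keepAliveBeforeAndAfter();
        System.out.println("SO_KEEPALIVE by default: " + result[0]);
        System.out.println("SO_KEEPALIVE after setKeepAlive(true): " + result[1]);
    }
}
```

This matches what setSoKeepAlive(true) on SocketConfig asks HttpClient to do for the sockets it opens.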
Then the net.ipv4.tcp_keepalive_time=350 setting in my sysctl.conf (run sysctl -p to apply the change) takes effect for new connections. You can verify this as follows:
netstat -o | grep <remote-host>
tcp6 0 0 ip-172-30-4-233.e:50414 <remote-host>:http ESTABLISHED keepalive (152.12/0/0)
So a TCP keep-alive probe is sent 350 seconds after the last packet, and when it gets no response the ESTABLISHED connection is closed. All TCP keep-alive packets can be seen with tcpdump.
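The sysctl setting applies to every keep-alive socket on the host. If you would rather tune a single connection, the sketch below shows one possible approach with plain JDK sockets, assuming Java 11 or later on a platform that supports jdk.net.ExtendedSocketOptions.TCP_KEEPIDLE (Linux does). The class name PerSocketKeepAlive, the helper names, and the 300-second value are illustrative assumptions; 300 was chosen only because it is below the 350-second NAT gateway idle timeout, which is what the linked AWS page suggests if the goal is to keep the connection open. Wiring this into HttpClient is not shown.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import jdk.net.ExtendedSocketOptions;

public class PerSocketKeepAlive {

    /**
     * Enables SO_KEEPALIVE and, where the platform supports it, sets the idle time
     * before the first keep-alive probe. Returns true if the idle time was applied.
     */
    static boolean enableKeepAlive(Socket socket, int idleSeconds) throws IOException {
        socket.setKeepAlive(true);
        if (socket.supportedOptions().contains(ExtendedSocketOptions.TCP_KEEPIDLE)) {
            socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
            return true;
        }
        return false; // fall back to the system-wide net.ipv4.tcp_keepalive_time
    }

    /**
     * Demo on a loopback connection: returns the idle time read back from the socket,
     * or -1 if TCP_KEEPIDLE is not supported on this platform.
     */
    static int demoIdleSeconds(int idleSeconds) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            if (!enableKeepAlive(client, idleSeconds)) {
                return -1;
            }
            return client.getOption(ExtendedSocketOptions.TCP_KEEPIDLE);
        }
    }

    public static void main(String[] args) throws IOException {
        // 300 is a hypothetical value below the 350-second NAT gateway timeout.
        System.out.println("Keep-alive idle seconds: " + demoIdleSeconds(300));
    }
}
```

The supportedOptions() check keeps the sketch from throwing UnsupportedOperationException on platforms without this option.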