io.netty.handler.timeout.ReadTimeoutException: null error on AWS

Question

We are experiencing a very strange problem. We are using the WebFlux WebClient to make an API call. It was working fine, and then last week we started seeing this intermittent error:

io.netty.handler.timeout.ReadTimeoutException: null

To give some background: our dev server runs on bare metal, is stable, and has no issues at all. The error only happens in production, on our AWS instance.

We see it on every second service call, which made me suspect the load balancer, but my infrastructure team cannot see anything obvious.

This is the code that makes the service call.

Mono<String> responseMono = headersSpec.header("X-CORRELATION-ID", correlationId)
    .accept(MediaType.APPLICATION_JSON)
    .exchangeToMono(response -> {
        if (response.statusCode().equals(HttpStatus.OK)) {
            return response.bodyToMono(String.class);
        } else if (response.statusCode().equals(HttpStatus.BAD_REQUEST)) {
            return Mono.error(new DataViolationException("Data rejected","Data rejected",null,"ProductUse.useProduct"));
        } else {
            return Mono.error(new GeneralServiceException("Error status received: " + response.statusCode().toString() + " [" + response.bodyToMono(String.class) + "]",null));
        }
    });

I am wondering if it is an issue with our version of Spring? We are running the following:

Java Version: 18

Spring Version: 3.0.5

This application runs on Kubernetes.

Any suggestions would be much appreciated. We are at a bit of a loss here: we have read other posts about this error but still cannot work out why it is happening to us, or why it only happens on AWS and not on bare metal.

Answer 1

Score: 1

Let us break this down. A ReadTimeoutException is raised when no response arrives from the server within the configured timeout period. In your case, since the issue happens consistently in production and not on your development server, there may be a network issue between your AWS instance and the external service.
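
To make the mechanism concrete, here is a minimal sketch using plain java.net sockets rather than Netty or WebClient (an assumption made so the example runs without any dependencies). The class name ReadTimeoutDemo, the method callSlowServer, and the delay values are all hypothetical. A throwaway local server delays its reply; the client read fails when its timeout is shorter than the delay and succeeds when it is longer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {

    /**
     * Starts a throwaway local server that waits serverDelayMs before sending one byte,
     * then reads from it with the given read timeout.
     * Returns "OK" if the byte arrived in time, "READ_TIMEOUT" if it did not.
     */
    static String callSlowServer(int serverDelayMs, int readTimeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread slowServer = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    Thread.sleep(serverDelayMs);      // simulate a slow upstream service
                    OutputStream out = peer.getOutputStream();
                    out.write(1);
                    out.flush();
                } catch (IOException | InterruptedException ignored) {
                    // the client may already have given up; nothing to clean up here
                }
            });
            slowServer.setDaemon(true);
            slowServer.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.setSoTimeout(readTimeoutMs);   // the "configured timeout period"
                InputStream in = client.getInputStream();
                return in.read() == 1 ? "OK" : "UNEXPECTED";
            } catch (SocketTimeoutException e) {
                return "READ_TIMEOUT";
            }
        } catch (IOException e) {
            return "IO_ERROR: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(callSlowServer(500, 100));   // prints READ_TIMEOUT
        System.out.println(callSlowServer(100, 2000));  // prints OK
    }
}
```

If the real service behaves the same way, so that a longer timeout makes the error disappear, that would suggest the upstream is slow rather than unreachable. If calls still fail with a generous timeout, the network path is the more likely suspect.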

Now, let us troubleshoot the problem together.

  • Increase the timeout duration and see whether that resolves the issue. With WebClient on Reactor Netty, the response timeout is set on the underlying HttpClient via responseTimeout(Duration). You can also use a debugger to step through the code and determine where the timeout occurs, which tells you whether the problem lies in the WebClient or somewhere else in your code.

  • There may be a network connectivity problem between your AWS instance and the external service. Test the connection by running ping or traceroute from the AWS instance to the external service.

  • You can also upgrade Spring and Java. Java 18 is a non-LTS release, and a compatibility issue could be contributing to the timeouts. Try upgrading to newer versions and see whether that resolves the issue.
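
ICMP is often filtered in cloud networks, so ping and traceroute can fail even when the service is reachable. As a complementary check, the sketch below times a plain TCP connect to the service's port from inside the pod or instance. The class name TcpProbe, the method connectMillis, and the default target example.com:443 are placeholders, not values from the question.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class TcpProbe {

    /**
     * Tries to open a TCP connection to host:port within timeoutMs.
     * Returns the connect time in milliseconds, or -1 if the attempt failed or timed out.
     */
    static long connectMillis(String host, int port, int timeoutMs) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            // covers refused connections, connect timeouts, and unresolvable hosts
            return -1;
        }
    }

    public static void main(String[] args) {
        // Placeholder target: pass the real host and port of the external service as arguments.
        String host = args.length > 0 ? args[0] : "example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;
        for (int i = 1; i <= 10; i++) {
            System.out.println("attempt " + i + ": " + connectMillis(host, port, 3000) + " ms");
        }
    }
}
```

Running it repeatedly from both the AWS environment and the bare-metal dev server, and comparing the numbers, may show whether failed or slow connects are specific to the AWS network path.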

huangapple
  • Posted on May 17, 2023, 21:03:33
  • Please keep this link when reposting: https://go.coder-hub.com/76272426.html