Spring Boot random SSLException: Connection reset in Kubernetes with JDK 11
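Context:
- We have a Spring Boot (2.3.1.RELEASE) web application.
- It is written in Java 8 but runs in a container using Java 11 (openjdk:11.0.6-jre-stretch).
- It has a database connection and an upstream service that it calls over HTTPS with a plain RestTemplate#exchange call (this is important!).
- It is deployed inside a Kubernetes cluster (not sure whether this matters).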
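Problem:
- Every day, I see a small percentage of requests to the upstream service fail with this error:
I/O error on GET request for "https://upstream.xyz/path": Connection reset; nested exception is javax.net.ssl.SSLException: Connection reset
- The errors are completely random and occur intermittently.
- We previously hit a similar error (javax.net.ssl.SSLProtocolException: Connection reset) that was related to JRE 11 and its TLS 1.3 negotiation. Updating our Docker image to the one mentioned above fixed that issue.
- Here is the stack trace from the error: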
java.net.SocketException: Connection reset
at java.base/java.net.SocketInputStream.read(Unknown Source)
at java.base/java.net.SocketInputStream.read(Unknown Source)
at java.base/sun.security.ssl.SSLSocketInputRecord.read(Unknown Source)
at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(Unknown Source)
at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(Unknown Source)
at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(Unknown Source)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.springframework.http.client.HttpComponentsClientHttpRequest.executeInternal(HttpComponentsClientHttpRequest.java:87)
at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53)
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:739)
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:674)
at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:583)
....
Configuration:
public static RestTemplate create(final int maxTotal, final int defaultMaxPerRoute,
                                  final int connectTimeout, final int readTimeout,
                                  final String userAgent) {
    final Registry<ConnectionSocketFactory> schemeRegistry = RegistryBuilder.<ConnectionSocketFactory>create()
            .register("http", PlainConnectionSocketFactory.getSocketFactory())
            .register("https", SSLConnectionSocketFactory.getSocketFactory())
            .build();
    final PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager(schemeRegistry);
    connManager.setMaxTotal(maxTotal);
    connManager.setDefaultMaxPerRoute(defaultMaxPerRoute);
    final CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(connManager)
            .setUserAgent(userAgent)
            .setDefaultRequestConfig(RequestConfig.custom()
                    .setConnectTimeout(connectTimeout)
                    .setSocketTimeout(readTimeout)
                    .setExpectContinueEnabled(false).build())
            .build();
    return new RestTemplateBuilder()
            .requestFactory(() -> new HttpComponentsClientHttpRequestFactory(httpClient))
            .build();
}
Has anyone run into this problem? When I turn on debug logging for the HTTP client, it is flooded with noise and I cannot make out anything useful...
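Solution
We ran into a similar problem when migrating to AWS/Kubernetes, and I have found the cause.
You are using a connection pool. By default, PoolingHttpClientConnectionManager reuses connections, so a connection is not closed as soon as your request completes. This saves resources because the client does not have to reconnect every time.
A Kubernetes cluster uses NAT (Network Address Translation) for outgoing connections. When a connection goes unused for some time, it is removed from the NAT table and the connection is broken. The next request that reuses that pooled connection then fails, which causes the seemingly random SSLExceptions.
On AWS, a connection is removed from the NAT table once it has been idle for 350 seconds. Other Kubernetes environments may have different settings. See https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-troubleshooting.html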
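The interaction between the pool and the NAT table can be illustrated with a toy model. This is only a sketch of the reasoning above, not real networking code: the class name IdleConnectionModel, its methods, and the outcome strings are made up for illustration, and the 350-second value is the AWS figure quoted above.

```java
import java.time.Duration;

/** Toy model of a pooled connection sitting idle behind a NAT device (illustrative only). */
final class IdleConnectionModel {
    /** NAT idle timeout; 350 s is the AWS value cited above, other environments may differ. */
    static final Duration NAT_IDLE_TIMEOUT = Duration.ofSeconds(350);

    /** True if the NAT device would still hold a mapping for a connection idle this long. */
    static boolean natMappingAlive(Duration idle) {
        return idle.compareTo(NAT_IDLE_TIMEOUT) < 0;
    }

    /**
     * Describes what happens when the pool is asked for a connection that has been idle
     * for the given time. poolMaxIdle is the pool's own eviction limit, or null if the
     * pool never evicts idle connections.
     */
    static String reuseOutcome(Duration idle, Duration poolMaxIdle) {
        if (poolMaxIdle != null && idle.compareTo(poolMaxIdle) >= 0) {
            return "evicted by pool, new connection opened";
        }
        return natMappingAlive(idle) ? "reused successfully" : "reused, then Connection reset";
    }
}
```

With no eviction limit, a connection idle for 400 seconds is handed out again and fails. With a 300-second limit, the pool has already discarded it before the NAT mapping expires.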
Solutions:
Disable connection reuse:
final CloseableHttpClient closeableHttpClient = HttpClients.custom()
        .setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
        .setConnectionManager(poolingHttpClientConnectionManager)
        .build();
Or let the httpClient evict connections that have been idle for too long:
return HttpClients.custom()
        .evictIdleConnections(300, TimeUnit.SECONDS) //Read the javadocs, may not be used when the instance of HttpClient is created inside an EJB container.
        .setConnectionManager(poolingHttpClientConnectionManager)
        .build();
Or call setKeepAliveStrategy(....) with a custom ConnectionKeepAliveStrategy that never returns -1 or a timeout of more than 300 seconds.
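The keep-alive option comes down to clamping whatever duration the default strategy reports. Below is a minimal sketch of that clamping logic in plain Java. The class KeepAliveCap and its constant are hypothetical names, and the 300-second ceiling is simply the value suggested above. The Apache HttpClient 4.x wiring appears only as a comment and is an assumption about how you would plug it in, where the builder method is setKeepAliveStrategy.

```java
/** Clamps a keep-alive duration so a pooled connection never outlives the NAT idle timeout. */
final class KeepAliveCap {
    /** Hypothetical ceiling: 300 s, chosen to stay below the 350 s NAT timeout. */
    static final long MAX_KEEP_ALIVE_MILLIS = 300_000L;

    /**
     * @param serverHintMillis keep-alive duration derived from the response in milliseconds;
     *                         HttpClient 4.x reports -1 when the server gave no timeout.
     *                         Treating every non-positive value as "no hint" is this
     *                         sketch's assumption.
     * @return a positive duration that never exceeds MAX_KEEP_ALIVE_MILLIS
     */
    static long cap(long serverHintMillis) {
        if (serverHintMillis <= 0) {
            return MAX_KEEP_ALIVE_MILLIS; // no usable hint: fall back to the ceiling
        }
        return Math.min(serverHintMillis, MAX_KEEP_ALIVE_MILLIS);
    }

    // Assumed wiring with Apache HttpClient 4.x (not compiled as part of this sketch):
    //
    // HttpClients.custom()
    //         .setKeepAliveStrategy((response, context) -> KeepAliveCap.cap(
    //                 DefaultConnectionKeepAliveStrategy.INSTANCE.getKeepAliveDuration(response, context)))
    //         .setConnectionManager(poolingHttpClientConnectionManager)
    //         .build();
}
```

A server hint shorter than the ceiling is passed through unchanged, so this only shortens connection lifetimes and never extends them beyond what the upstream asked for.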