问题
客户端发送HTTPS请求报错:SocketException: Connection or inbound has closed,服务端收到了请求,异常发生在客户端解析服务端响应阶段,并且线程出现阻塞现象,异常堆栈日志卡了几十秒才打印出来,部分异常堆栈:
Caused by: java.net.SocketException: Connection or inbound has closed
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:934)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.http.client.fluent.Request.internalExecute(Request.java:173)
at org.apache.http.client.fluent.Executor.execute(Executor.java:262)
问题发生的背景是在进行系统架构升级后的新老系统切换期间,相同的代码在老系统运行无异常,只有在新系统运行才会抛出上述异常,猜测可能是新老系统运行环境的差异导致了上述问题,进行环境对比后发现jdk版本不一致,新系统的JDK版本为1.8.0_311,而老系统运行的1.8.0_172版本的 SSLSocketImpl 类是不会抛出 java.net.SocketException: Connection or inbound has closed 这段错误的,于是决定偷个懒把新系统JDK版本改成和老系统一样来规避这个问题,但是又出现了新错误 ProtocolException: The server failed to respond with a valid HTTP response:
Caused by: org.apache.http.ProtocolException: The server failed to respond with a valid HTTP response
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:149)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
排查
没办法,只能动手查了,根据问题现象,存在线程阻塞,先找到线程阻塞的原因,执行 jstack -l pid 并根据线程名称输出阻塞线程的堆栈:
"DubboServerHandler-192.168.163.230:9461-thread-199" #8857 daemon prio=5 os_prio=0 tid=0x00007fba15f74000 nid=0x6b40 waiting for monitor entry [0x00007fb9fe6f2000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:87)
- waiting to lock <0x000000073b3af7b0> (a sun.security.ssl.AppInputStream)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.http.client.fluent.Request.internalExecute(Request.java:173)
at org.apache.http.client.fluent.Executor.execute(Executor.java:262)
通过堆栈发现线程阻塞的原因是因为在等一个锁,继续搜索线程堆栈找到“0x000000073b3af7b0”这个锁的当前持有线程:
"data-collect-0" #790 daemon prio=5 os_prio=0 tid=0x00007fba650d4000 nid=0x3ed runnable [0x00007fba14efa000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
- locked <0x000000073b3af138> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x000000073b3af7b0> (a sun.security.ssl.AppInputStream)
这里只粘贴了部分堆栈,通过堆栈信息定位到是公司内部的自动化测试流量采集框架的 worker 线程在执行流量采集时与 dubbo 线程之间发生了锁竞争,导致 dubbo 线程被阻塞。果断停掉它,问题得到解决(实际也是查了好久,坑啊!!!)。