现象描述
Hive环境一个数据库,拥有表8000+,业务代码需要挨个desc {tableName}
来获取表信息。当程序运行到4000+左右,开始出现获取信息失败。
查找原因
通过查看日志发现是连接已断开,具体如下:
【INFO】org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: java.sql.SQLException: Connection has already been closed.
### The error may exist in 省略/HiveCollectionMapper.xml
### The error may involve 省略.HiveCollectionMapper.getTableParameters
### The error occurred while executing a query
### SQL: desc formatted logsget.`dim_qdas_superderive_eventlist`
### Cause: java.sql.SQLException: Connection has already been closed.
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:149)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:147)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:80)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:57)
at com.sun.proxy.$Proxy365.getTableParameters(Unknown Source)
...省略
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:800)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1471)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: Connection has already been closed.
at org.apache.tomcat.jdbc.pool.ProxyConnection.invoke(ProxyConnection.java:117)
at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
at org.apache.tomcat.jdbc.pool.interceptor.AbstractCreateStatementInterceptor.invoke(AbstractCreateStatementInterceptor.java:75)
at org.apache.tomcat.jdbc.pool.JdbcInterceptor.invoke(JdbcInterceptor.java:108)
at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:81)
at com.sun.proxy.$Proxy342.prepareStatement(Unknown Source)
at sun.reflect.GeneratedMethodAccessor280.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.ibatis.logging.jdbc.ConnectionLogger.invoke(ConnectionLogger.java:55)
at com.sun.proxy.$Proxy360.prepareStatement(Unknown Source)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.instantiateStatement(PreparedStatementHandler.java:86)
at org.apache.ibatis.executor.statement.BaseStatementHandler.prepare(BaseStatementHandler.java:88)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.prepare(RoutingStatementHandler.java:59)
at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:86)
at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:62)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:136)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147)
分析日志:
在mybatis调用DefaultSqlSession.selectList()方法时,突然发现Connect已经断开了,而上一次还可以正常运行这个方法。太突然了!笔者感觉DEBUG日志有更具体的原因,还是开启DEBUG日志,重新跑一次看看。
开启DEBUG日志,重新跑一次这个过程,果然在DEBUG日志中发现了端倪。在抛出上面的java.sql.SQLException: Connection has already been closed.
异常之前抛出了下面的异常:
【DEBUG】org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: java.sql.SQLException: Error retrieving next row
### The error may exist in 省略/HiveCollectionMapper.xml
### The error may involve 省略.getTableParameters
### The error occurred while handling results
### SQL: desc formatted logsget.`dim_qdas_region`
### Cause: java.sql.SQLException: Error retrieving next row
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:149)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:147)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:80)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:57)
at com.sun.proxy.$Proxy365.getTableParameters(Unknown Source)
...省略
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:800)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1471)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: Error retrieving next row
at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:388)
at sun.reflect.GeneratedMethodAccessor277.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.ibatis.logging.jdbc.ResultSetLogger.invoke(ResultSetLogger.java:69)
at com.sun.proxy.$Proxy363.next(Unknown Source)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleRowValuesForSimpleResultMap(DefaultResultSetHandler.java:352)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleRowValues(DefaultResultSetHandler.java:328)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleResultSet(DefaultResultSetHandler.java:301)
at org.apache.ibatis.executor.resultset.DefaultResultSetHandler.handleResultSets(DefaultResultSetHandler.java:194)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:65)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:79)
at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:63)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:136)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147)
... 97 common frames omitted
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Socket closed
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:460)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_FetchResults(TCLIService.java:501)
at org.apache.hive.service.cli.thrift.TCLIService$Client.FetchResults(TCLIService.java:488)
at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:360)
... 113 common frames omitted
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 125 common frames omitted
日志分析:
上面异常从hive-jdbc依赖的jar包libthrift中抛出的,大致过程可以整理如下:
Hive-jdbc中Socket断开—> Mybatis Connection中断—> Tomcat-jdbc 连接断开。
笔者打算从距离业务代码最近的Tomcat-jdbc入手,解决这个问题,开启debug模式,查看tomcat-jdbc中的连接信息,截个图:
testOnBorrow、testWhileIdle、timeBetweenEvictionRunsMillis等几个重要的参数貌似都没问题,郁闷…这里有好几个时间值,究竟是哪个设置的不合理导致了连接失效?
代码不是获取8000+表的信息吗?不如统计一下从第一个正常的到第一个不正常的中间经过了多长时间,然后比较这个时间和上图各时间值,也许能找到突破口。
Do it !
果然日志是最好的fixbug资料,找到了两个时间点如下,大概相隔6分钟:
2019-12-24 14:30:44 [INFO] --第一个正常
2019-12-24 14:36:49 [DEBUG] --第一个失败
不能如此武断定了removeAbandonedTimeout的罪,在跑两次,结果间隔时间依然是6分钟。OK,那就是tomcat-jdbc干的了?在数据库连接服务了6分钟之后,被当做遗弃的连接给remove了?调整这个removeAbandonedTimeout=1200,再试一次~
2019-12-24 16:07:20 [INFO] --第一个正常
2019-12-24 16:27:25 [DEBUG] --第一个失败
应该可以下结论了,的确是removeAbandonedTimeout时间一到,连接就断开了。
解决办法
笔者想通过关掉removeAbandoned功能removeAbandoned=false
,来解决这个问题。但心里仍有疑惑:
- 为什么Connection一直在使用中,缺会被removeAbandoned功能主动中断,这个abandoned怎么来判断的?
接下来笔者想通过阅读tomcat-jdbc源码来获得答案。也欢迎网友们提供参考链接。