SpringBoot-Log4j2: troubleshooting a case where the logging component blocks and hangs the process

I. Symptoms

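A Spring Boot application packaged as a jar runs for a fairly long time. At some point during the run the process hangs: no more log output appears, and the database shows no SQL statement that is still executing.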

II. Troubleshooting

Exporting the Java thread information with jstack gives the following:

2021-02-22 18:46:38
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode):

"Attach Listener" #99 daemon prio=9 os_prio=0 tid=0x00007f4478001000 nid=0x18f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"HikariPool-1 housekeeper" #24 daemon prio=5 os_prio=0 tid=0x00007f451e5b8000 nid=0x98 waiting for monitor entry [0x00007f449481b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.logging.log4j.core.appender.OutputStreamManager.writeBytes(OutputStreamManager.java:360)
	- waiting to lock <0x000000008031aea8> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
	at org.apache.logging.log4j.core.layout.TextEncoderHelper.writeEncodedText(TextEncoderHelper.java:96)
	at org.apache.logging.log4j.core.layout.TextEncoderHelper.encodeText(TextEncoderHelper.java:65)
	at org.apache.logging.log4j.core.layout.StringBuilderEncoder.encode(StringBuilderEncoder.java:68)
	at org.apache.logging.log4j.core.layout.StringBuilderEncoder.encode(StringBuilderEncoder.java:32)
	at org.apache.logging.log4j.core.layout.PatternLayout.encode(PatternLayout.java:220)
	at org.apache.logging.log4j.core.layout.PatternLayout.encode(PatternLayout.java:58)
	at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:177)
	at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:170)
	at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:161)
	at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
	at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
	at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
	at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
	at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:448)
	at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:433)
	at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
	at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:403)
	at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63)
	at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:146)
	at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2163)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2118)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2101)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2006)
	at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1875)
	at org.apache.logging.slf4j.Log4jLogger.debug(Log4jLogger.java:134)
	at com.zaxxer.hikari.pool.HikariPool.logPoolState(HikariPool.java:404)
	at com.zaxxer.hikari.pool.HikariPool$HouseKeeper.run(HikariPool.java:776)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

"Abandoned connection cleanup thread" #22 daemon prio=5 os_prio=0 tid=0x00007f451c93e800 nid=0x97 in Object.wait() [0x00007f4495bb1000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
	- locked <0x00000000816e8be0> (a java.lang.ref.ReferenceQueue$Lock)
	at com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:64)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

"Service Thread" #17 daemon prio=9 os_prio=0 tid=0x00007f451c2ca800 nid=0x93 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00007f451c2c7800 nid=0x92 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00007f451c2c6000 nid=0x91 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00007f451c2c4000 nid=0x90 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00007f451c2c2000 nid=0x8f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread7" #12 daemon prio=9 os_prio=0 tid=0x00007f451c2c0000 nid=0x8e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread6" #11 daemon prio=9 os_prio=0 tid=0x00007f451c2be000 nid=0x8d waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread5" #10 daemon prio=9 os_prio=0 tid=0x00007f451c2bc000 nid=0x8c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread4" #9 daemon prio=9 os_prio=0 tid=0x00007f451c2ba000 nid=0x8b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007f451c2b8000 nid=0x8a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f451c2b6000 nid=0x89 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f451c2b4000 nid=0x88 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f451c2b1000 nid=0x87 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f451c2af800 nid=0x86 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f451c27d800 nid=0x85 in Object.wait() [0x00007f44973f2000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
	- locked <0x00000000802ca438> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f451c278800 nid=0x84 in Object.wait() [0x00007f44974f3000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:502)
	at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
	- locked <0x00000000802ca668> (a java.lang.ref.Reference$Lock)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00007f451c009800 nid=0x75 runnable [0x00007f4523eb2000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:326)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	- locked <0x00000000802ba940> (a java.io.BufferedOutputStream)
	at java.io.PrintStream.write(PrintStream.java:482)
	- locked <0x00000000802ba920> (a java.io.PrintStream)
	at org.apache.logging.log4j.core.appender.ConsoleAppender$SystemOutStream.write(ConsoleAppender.java:338)
	at java.io.PrintStream.write(PrintStream.java:480)
	- locked <0x000000008031cf78> (a java.io.PrintStream)
	at org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.java:53)
	at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:262)
	- eliminated <0x000000008031aea8> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
	at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager.java:294)
	- eliminated <0x000000008031aea8> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
	at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:303)
	- locked <0x000000008031aea8> (a org.apache.logging.log4j.core.appender.OutputStreamManager)
	at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:179)
	at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:170)
	at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:161)
	at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
	at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
	at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
	at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
	at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:448)
	at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:433)
	at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
	at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:403)
	at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63)
	at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:146)
	at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2163)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2118)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2101)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2000)
	at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1859)
	at org.apache.logging.slf4j.Log4jLogger.debug(Log4jLogger.java:119)
	at com.ll.hiws.warning.service.impl.WarningResultServiceImpl.InfectCalculateFromResult(WarningResultServiceImpl.java:410)
	at com.ll.hiws.warning.service.impl.WarningResultServiceImpl.warningCalculate(WarningResultServiceImpl.java:179)
	at com.ll.hiws.warning.service.impl.WarningResultServiceImpl.InfectCalculate(WarningResultServiceImpl.java:103)
	at com.ll.hiws.warning.service.impl.WarningResultServiceImpl$$FastClassBySpringCGLIB$$3f8499af.invoke(<generated>)
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:747)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
	at org.springframework.transaction.interceptor.TransactionInterceptor$$Lambda$165/1989184704.proceedWithInvocation(Unknown Source)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
	at com.ll.hiws.warning.service.impl.WarningResultServiceImpl$$EnhancerBySpringCGLIB$$57158dfd.InfectCalculate(<generated>)
	at com.ll.hiws.HiwsApplicationRunner.run(HiwsApplicationRunner.java:70)
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:781)
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:771)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:335)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1246)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1234)
	at com.ll.hiws.HiwsWarningApplication.main(HiwsWarningApplication.java:11)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
	at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)

"VM Thread" os_prio=0 tid=0x00007f451c271000 nid=0x83 runnable 

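First, we can see that the "HikariPool-1 housekeeper" thread (HikariPool is the database connection pool used by Spring Boot) is in the BLOCKED state, waiting for the lock <0x000000008031aea8> to be released. That lock is the monitor of Log4j's OutputStreamManager.

Further down in the dump, the "main" thread is shown holding the same lock ("- locked <0x000000008031aea8>") inside OutputStreamManager.flush.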

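The main thread holds the lock and the connection pool thread waits for it; once main finishes writing and releases the lock, the connection pool thread can carry on. That all looks normal. The main thread is also RUNNABLE, which for a thread inside a native method can simply mean it is waiting on an operating system resource, so nothing looks wrong there either.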
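The relationship read off the dump above, one thread BLOCKED on a monitor that another thread holds, can also be queried from inside the JVM. Below is a minimal sketch using the standard java.lang.management API. The class name BlockedThreadReport and the thread names "holder" and "waiter" are made up for illustration and are not part of the application in this post.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper: lists BLOCKED threads and the thread that owns the monitor they want. */
class BlockedThreadReport {

    /** Returns a map of blocked thread name -> name of the thread holding the contended monitor. */
    static Map<String, String> blockedThreads() {
        Map<String, String> result = new LinkedHashMap<>();
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info.getThreadState() == Thread.State.BLOCKED && info.getLockOwnerName() != null) {
                result.put(info.getThreadName(), info.getLockOwnerName());
            }
        }
        return result;
    }

    /** Recreates the shape of the dump: "holder" owns a monitor, "waiter" is BLOCKED on it. */
    static Map<String, String> demo() {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await(); // keep the monitor until the report has been taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // nothing to do; we only care about waiting for the monitor
            }
        }, "waiter");
        holder.setDaemon(true);
        waiter.setDaemon(true);
        try {
            holder.start();
            held.await();
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
            Map<String, String> report = blockedThreads();
            release.countDown();
            holder.join();
            waiter.join();
            return report;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach((blocked, owner) ->
                System.out.println(blocked + " is BLOCKED, lock held by " + owner));
    }
}
```

A report like this only says who holds the lock. As with jstack, it does not say why the holder is not making progress, so it has to be taken more than once and compared.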

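The problem showed up when I took another thread dump a little later and found it was identical: the program had not advanced at all. So the main thread's writeBytes call had been waiting the whole time.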

On JIRA I found this report: ConsoleAppender hangs when writing to System.out in a spawned JVM

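It is fairly low-level and I did not fully understand it. Roughly, it says that setting the ConsoleAppender's follow attribute, which defaults to false, to true solves the problem.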

Afterwards I went through a lot of material. Here is a rough summary of my understanding:

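When logs are printed to the console, they are written into a buffer that the console reads from. If the console stops reading, the buffer is never drained. For example, when running in IDEA and selecting text in the console with the mouse, the console pauses its output and stops taking data from the buffer. The logging code keeps writing until the buffer is full, and then the writing thread stops and waits for space to become available. In the program this shows up as writeBytes never returning and the lock it holds never being released, so the program hangs.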
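The mechanism described above can be imitated inside a single JVM. The sketch below is only an analogy, not the real console path: it uses java.io.PipedOutputStream in place of the operating system pipe behind stdout, and the class name FullPipeDemo and the thread name "log-writer" are invented for illustration. One difference from the dump: a thread stalled in the native FileOutputStream.writeBytes call is reported as RUNNABLE, whereas the thread in this sketch waits inside Java code.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicInteger;

/** In-JVM analogy of a full stdout pipe: the writer stalls once nobody drains the buffer. */
class FullPipeDemo {

    /**
     * Writes single bytes into a pipe of the given capacity that is never read,
     * waits for the writer to stall, and returns how many bytes were accepted.
     */
    static int bytesAcceptedBeforeStall(int capacity) {
        final AtomicInteger accepted = new AtomicInteger();
        try {
            final PipedOutputStream out = new PipedOutputStream();
            // The read side exists but is never read from, like a paused console.
            PipedInputStream idleReader = new PipedInputStream(out, capacity);
            Thread writer = new Thread(() -> {
                try {
                    while (true) {
                        out.write('x'); // does not return while the buffer is full
                        accepted.incrementAndGet();
                    }
                } catch (IOException e) {
                    // the pipe was closed: stop writing
                }
            }, "log-writer");
            writer.setDaemon(true);
            writer.start();

            // Wait (up to about 5 s) for the buffer to fill, then give the writer
            // extra time to show that it really is stuck and not just slow.
            long deadline = System.currentTimeMillis() + 5000;
            while (accepted.get() < capacity && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
            Thread.sleep(300);
            int result = accepted.get();

            idleReader.close(); // lets the stalled writer fail with IOException and exit
            return result;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes accepted before the writer stalled: "
                + bytesAcceptedBeforeStall(64));
    }
}
```

If the stalled write happened while a lock was held, as in OutputStreamManager.flush in the dump, every other thread that needs that lock would queue up behind it.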

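The same thing can happen when the application is deployed in Docker. Docker continuously reads the container's standard output to record the container logs, but when a lot of data piles up in the buffer, for example when log lines are extremely long, Docker cannot drain the buffer in time and Log4j runs into the same problem. According to the material I found, this can be fixed by upgrading Docker.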

But wait: this problem only affects logs written to the console. What if we do not log to the console at all?

So here are the solutions I collected:

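1. When using ConsoleAppender, set its follow attribute to true.

2. (Not verified by me) Switch to a different component: within the Log4j2 framework, additionally introduce disruptor. See: cnblogs (博客园), "Performance problems when log4j outputs to the console"

3. At runtime, configure Log4j not to log to the console and write everything to files instead. With this change my program ran normally.

4. (Not verified by me) Upgrade Docker to version 18.06. See: Logging long lines breaks container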
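Options 2 and 3 share one idea: the application thread should not be the one that waits on a slow console. As a rough illustration of that idea only, here is a sketch in which callers hand messages to a bounded queue and a background thread does the slow write. It is not how Log4j2 or disruptor are implemented, and what Log4j2 does when its own queue fills up depends on its configuration. The class DroppingLogQueue and all of its methods are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical sketch: callers enqueue without blocking, a daemon thread performs the slow write. */
class DroppingLogQueue {
    private final BlockingQueue<String> queue;
    private final AtomicInteger dropped = new AtomicInteger();

    DroppingLogQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks the caller: returns false and counts a drop when the queue is full. */
    boolean log(String message) {
        boolean accepted = queue.offer(message);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Number of messages discarded because the queue was full. */
    int droppedCount() {
        return dropped.get();
    }

    /** Starts a daemon thread that passes queued messages to the given sink, e.g. System.out::println. */
    Thread startWriter(Consumer<String> sink) {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    sink.accept(queue.take()); // only this thread can stall on a slow sink
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "log-writer");
        writer.setDaemon(true);
        writer.start();
        return writer;
    }

    public static void main(String[] args) throws InterruptedException {
        DroppingLogQueue log = new DroppingLogQueue(1024);
        log.startWriter(System.out::println);
        log.log("hello from the application thread");
        Thread.sleep(100); // give the daemon writer a moment before the JVM exits
    }
}
```

The trade-off in this sketch is explicit: when the sink stalls, messages are lost instead of the application thread hanging. Whether that is acceptable depends on how important the log lines are.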

 

Here are the related links I found:

1. ConsoleAppender hangs when writing to System.out in a spawned JVM

2. OutputStreamManager in ConsoleAppender leaking managers

3. Deadlock in log4j 2.10.0

4. Deadlock with logging

5. Logging long lines breaks container

6. A RUNNABLE state thread hangs on the java.io.FileOutputStream.writeBytes method

7. Troubleshooting a container that blocks while printing logs to the console (容器打印日志到控制台阻塞的排障)
