Understanding the Spring Shutdown Hook Deadlock

2023-05-14 08:05:46 deadlock, symptoms, analysis

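The Spring Shutdown Hook Deadlock

We once hit a case where a project failed but Spring did not shut down properly. The cause turned out to be a deadlock involving Spring's shutdown hook.

A framework we use had code like the following embedded in it: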

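import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextRefreshedEvent;
import org.springframework.stereotype.Component;

@Component
public class ShutDownHookTest implements ApplicationListener<ContextRefreshedEvent> {
    // Placeholder for the framework's failure check; how it gets set is not shown here.
    private boolean onException;

    @Override
    public void onApplicationEvent(ContextRefreshedEvent event) {
        if (onException) {
            System.out.println("test shutdown hook deadlock");
            // Runs synchronously on the thread that is still inside refresh().
            System.exit(0);
        }
    }
}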

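The intent is to guarantee that the application exits, via System.exit, once an exception has occurred.

The listener does not use asynchronous events, so System.exit ran on the main thread. Yet the Spring Boot server kept running afterwards.

The application even looked healthy: ours is a Dubbo-based service system, and the service kept working normally in the test environment.

This clearly defies expectations. Normally Spring is notified of a System.exit call through a JVM shutdown hook and runs its shutdown handling. Let's first look at how Spring registers that hook: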

public abstract class AbstractApplicationContext extends DefaultResourceLoader
		implements ConfigurableApplicationContext {
	@Override
	public void registerShutdownHook() {
		if (this.shutdownHook == null) {
			// No shutdown hook registered yet.
			this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
				@Override
				public void run() {
					// Key point: the hook acquires the startupShutdownMonitor monitor lock here
					synchronized (startupShutdownMonitor) {
						doClose();
					}
				}
			};
			Runtime.getRuntime().addShutdownHook(this.shutdownHook);
		}
	}
 
	protected void doClose() {
		// Check whether an actual close attempt is necessary...
		if (this.active.get() && this.closed.compareAndSet(false, true)) {
			if (logger.isDebugEnabled()) {
				logger.debug("Closing " + this);
			}
 
			if (!NativeDetector.inNativeImage()) {
				LiveBeansView.unregisterApplicationContext(this);
			}
 
			try {
				// Publish shutdown event.
				publishEvent(new ContextClosedEvent(this));
			}
			catch (Throwable ex) {
				logger.warn("Exception thrown from ApplicationListener handling ContextClosedEvent", ex);
			}
 
			// Stop all Lifecycle beans, to avoid delays during individual destruction.
			if (this.lifecycleProcessor != null) {
				try {
					this.lifecycleProcessor.onClose();
				}
				catch (Throwable ex) {
					logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
				}
			}
 
			// Destroy all cached singletons in the context's BeanFactory.
			destroyBeans();
 
			// Close the state of this context itself.
			closeBeanFactory();
 
			// Let subclasses do some final clean-up if they wish...
			onClose();
 
			// Reset local application listeners to pre-refresh state.
			if (this.earlyApplicationListeners != null) {
				this.applicationListeners.clear();
				this.applicationListeners.addAll(this.earlyApplicationListeners);
			}
 
			// Switch to inactive.
			this.active.set(false);
		}
	}
}	

In other words, Spring creates a new thread and registers it as a JVM shutdown hook.
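The registration pattern can be tried in isolation. The following is a minimal sketch, not Spring's code: the class name `ShutdownHookSketch`, the helper `register`, and the thread name `DemoShutdownHook` are made up for illustration. It only relies on `Runtime.addShutdownHook`, which accepts a thread that has not been started yet and lets the JVM start it during shutdown.

```java
final class ShutdownHookSketch {

    // Hypothetical helper: wraps the cleanup in an unstarted thread and
    // hands it to the JVM, similar in spirit to registerShutdownHook().
    static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "DemoShutdownHook");
        // The JVM starts this thread itself when shutdown begins, for example
        // on System.exit or when the last non-daemon thread ends.
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Thread hook = register(() ->
                System.out.println("cleanup running in " + Thread.currentThread().getName()));
        // The hook has been registered but not started at this point.
        System.out.println("hook state after registration: " + hook.getState());
        // Returning from main ends the last non-daemon thread, which
        // begins JVM shutdown and runs the hook.
    }
}
```

The hook body runs on its own thread, so any lock it needs must be free at shutdown time. That detail is what matters for the rest of this analysis.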

The key point is that the hook must acquire the monitor lock on startupShutdownMonitor before closing. This lock looks familiar: Spring acquires the same lock in refresh().

public abstract class AbstractApplicationContext extends DefaultResourceLoader
		implements ConfigurableApplicationContext {
 
	@Override
	public void refresh() throws BeansException, IllegalStateException {
		synchronized (this.startupShutdownMonitor) {
			StartupStep contextRefresh = this.applicationStartup.start("spring.context.refresh");
 
			// Prepare this context for refreshing.
			prepareRefresh();
 
			// Tell the subclass to refresh the internal bean factory.
			ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();
 
			// Prepare the bean factory for use in this context.
			prepareBeanFactory(beanFactory);
 
			......
		}
	}
}	

At this point our guess was a deadlock on startupShutdownMonitor.

Let's take a thread dump with jstack:

"SpringContextShutdownHook" #18 prio=5 os_prio=0 tid=0x0000000024e00800 nid=0x407c waiting for monitor entry [0x000000002921f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:991)
        - waiting to lock <0x00000006c494f430> (a java.lang.Object)

"main" #1 prio=5 os_prio=0 tid=0x0000000002de4000 nid=0x1ff4 in Object.wait() [0x0000000002dde000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
        at java.lang.Thread.join(Thread.java:1252)
        - locked <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
        at java.lang.Thread.join(Thread.java:1326)
        at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:107)
        at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
        at java.lang.Shutdown.runHooks(Shutdown.java:123)
        at java.lang.Shutdown.sequence(Shutdown.java:167)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - locked <0x00000006c4845128> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Runtime.exit(Runtime.java:109)
        at java.lang.System.exit(System.java:971)
        at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:12)
        at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:7)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:176)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:169)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:143)
        at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)
        at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)
        at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)
        at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
        - locked <0x00000006c494f430> (a java.lang.Object)
        at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:144)
        at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:771)
        at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:763)

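At first glance jstack does not report a deadlock (nor do tools such as jvisualvm or jconsole), because main is parked in Object.wait() rather than blocked on a monitor, so there is no cycle of lock waits to detect. The thread dump nevertheless shows:

  • The main thread acquired the startupShutdownMonitor lock <0x00000006c494f430> first.
  • The SpringContextShutdownHook thread is waiting for the startupShutdownMonitor lock.
  • The main thread called Thread.join and is waiting in Object.wait() on the hook thread object <0x00000006c4a43118>, that is, waiting for the hook thread to terminate.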
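The three observations above can be reproduced without Spring. The sketch below is an illustration under assumptions, not the application's code: `JoinUnderLockDemo`, `Observation`, and the thread names are hypothetical, a plain `Object` stands in for startupShutdownMonitor, and a worker thread plays the role of main. It records both thread states and asks `ThreadMXBean.findDeadlockedThreads()` whether the JVM sees a deadlock, then interrupts the holder so the demo can end.

```java
import java.lang.management.ManagementFactory;

final class JoinUnderLockDemo {

    static final class Observation {
        final Thread.State holderState;
        final Thread.State hookState;
        final boolean deadlockReported;

        Observation(Thread.State holderState, Thread.State hookState, boolean deadlockReported) {
            this.holderState = holderState;
            this.hookState = hookState;
            this.deadlockReported = deadlockReported;
        }
    }

    static Observation observe() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor

        // Stands in for SpringContextShutdownHook: needs the monitor to do its work.
        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // doClose() would run here
            }
        }, "demo-hook");

        // Stands in for main: holds the monitor, then starts and joins the hook.
        Thread holder = new Thread(() -> {
            synchronized (monitor) {
                hook.start();
                try {
                    hook.join(); // cannot return while this thread holds the monitor
                } catch (InterruptedException e) {
                    // interrupted by observe() so that the demo terminates
                }
            }
        }, "demo-main");

        holder.start();
        try {
            // Poll until both threads have settled into their stuck states.
            long deadline = System.nanoTime() + 5_000_000_000L;
            while ((hook.getState() != Thread.State.BLOCKED
                    || holder.getState() != Thread.State.WAITING)
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Observation result = new Observation(
                    holder.getState(),
                    hook.getState(),
                    ManagementFactory.getThreadMXBean().findDeadlockedThreads() != null);

            holder.interrupt(); // join throws, the monitor is released, the hook proceeds
            holder.join();
            hook.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    public static void main(String[] args) {
        Observation o = observe();
        System.out.println("holder=" + o.holderState + " hook=" + o.hookState
                + " deadlockReported=" + o.deadlockReported);
    }
}
```

The holder ends up WAITING and the hook BLOCKED, matching the two stacks in the dump, while the deadlock query finds nothing to report. In the real incident nothing interrupts main, so the stall is permanent.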

The root cause is that System.exit blocks the main thread. Tracing further down the stack, the block is in ApplicationShutdownHooks:

class ApplicationShutdownHooks {
    
    static void runHooks() {
        Collection<Thread> threads;
        synchronized(ApplicationShutdownHooks.class) {
            threads = hooks.keySet();
            hooks = null;
        }
 
        for (Thread hook : threads) {
            hook.start();
        }
        for (Thread hook : threads) {
            while (true) {
                try {
                    // Wait for the shutdown hook thread to finish
                    hook.join();
                    break;
                } catch (InterruptedException ignored) {
                }
            }
        }
    }
}

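Summary

The full deadlock sequence:

  • main thread: at the start of refresh, Spring acquires the startupShutdownMonitor monitor lock.
  • main thread: System.exit is triggered before refresh has completed, while the lock is still held.
  • SpringContextShutdownHook thread: starts running and waits to acquire the startupShutdownMonitor monitor lock.
  • main thread: calls Thread.join to wait for the SpringContextShutdownHook thread to finish, which can never happen because main still holds the lock.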

So System.exit must not be triggered while Spring's refresh is still in progress, at least not from the thread performing the refresh.
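One possible workaround, offered as an assumption and not as something the original framework did, is to hand the exit call to a separate thread so that the refreshing thread can leave the synchronized block and release the lock. The sketch below simulates this with plain threads: `AsyncExitSketch`, `simulatedExit`, and `refreshThenExitAsync` are hypothetical names, `simulatedExit` mimics the start-then-join behaviour of runHooks() instead of calling System.exit, and a plain `Object` stands in for startupShutdownMonitor.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class AsyncExitSketch {

    // Simulated shutdown sequence: start the hook, then wait for it,
    // which is what runHooks() does for each registered hook.
    static void simulatedExit(Thread hook) {
        hook.start();
        try {
            hook.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if the simulated hook ran to completion.
    static boolean refreshThenExitAsync() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor
        final AtomicBoolean closed = new AtomicBoolean(false);

        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                closed.set(true); // stands in for doClose()
            }
        }, "demo-hook");

        Thread exitThread;
        synchronized (monitor) { // "refresh" in progress on the calling thread
            // Hand the exit to another thread instead of calling it here.
            // In a real listener this thread would call System.exit(code).
            exitThread = new Thread(() -> simulatedExit(hook), "demo-exit");
            exitThread.start();
        } // "refresh" finishes, the monitor is released, the hook can proceed

        try {
            exitThread.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return closed.get() && !exitThread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("hook completed: " + refreshThenExitAsync());
    }
}
```

Here the hook may still wait briefly for the lock, but only until the refreshing thread leaves its synchronized block, so the wait is bounded. In a Spring Boot application another option might be to react to an event published after refresh has finished, such as ApplicationReadyEvent, though whether that fits depends on when the failure needs to be handled.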

The above is based on personal experience; I hope it serves as a useful reference.
