用ThreadLocal解决SimpleDateFormat的性能问题

SimpleDateFormat是非线程安全的,在并发下碰到的案例见过好几次了。若场景对性能要求很高,创建大量生命周期很短暂的实例对象也是不小开销(主要是SimpleDateFormat内部比较复杂),可以考虑采用ThreadLocal来优化,以前曾见到过这种优化方式,忘了出处,在tomcat的代码里也有这样的用法,下面代码是tomcat7的DateTool工具类里的:

    public static final ThreadLocal<DateFormat> rfc1123Format = new ThreadLocal<DateFormat>() {
    @Override
    public DateFormat initialValue() {
        DateFormat result = new SimpleDateFormat(RFC1123_PATTERN, Locale.US);
        result.setTimeZone(GMT_ZONE);
        return result;
    }
};

调用的时候,每个线程可以通过rfc1123Format.get().format(...) 来使用。不过tomcat8里已经删除了这个类。

使用strace定位jvm退出的原因范围

今天遇到的一个tomcat启动过程中jvm退出的问题,不是jvm crash的情况,用户日志配置的不正确导致一些信息没有展现出来,只看到pandora执行了shutdownhook的信息。这可能是启动时的逻辑有触发System.exit,或被系统或人为kill掉了。

根据以往的经验,排除了oom killer或ulimit -t设置不当导致被内核给kill掉的情况,OS级别的signal通常不留机会给jvm执行shutdownhook的。如此一来singal的范围应该就是SIGTERM, SIGINT, SIGHUP这3种(参考这里)。

虽然singal范围缩小,但依然不能确定是因为代码里调用了System.exit还是人为(或被其他进程)kill引起的。直接上大招用systemtap需要安装kernal debuginfo,没有权限的话,还要找到对应的人去做;如果现象较容易重现的话,可以先通过strace命令进一步缩小问题的范围,究竟是因为jvm内部执行了System.exit还是外界的kill引起的。

这里通过启动一个scala的repl来模拟java进程,通过strace attach到jvm进程上,然后观察,如果是外界的kill所致,可以看到下面的信息:

$ sudo strace -p 1947
Process 1947 attached - interrupt to quit
futex(0x7fb7635959d0, FUTEX_WAIT, 1948, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
futex(0x7fb762762360, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn(0x7fb762762360)            = 202
futex(0x7fb7635959d0, FUTEX_WAIT, 1948, NULLPANIC: attached pid 1947 exited with 143
 <unfinished ... exit status 143>

里面的关键信息是SIGTERMexit status 143(即SIGTERM的code)

如果是kill -2或ctrl-c终止repl,可以看到有关SIGINT的信息

$ sudo strace -p 1813
Process 1813 attached - interrupt to quit
futex(0x7fb24d15a9d0, FUTEX_WAIT, 1814, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGINT (Interrupt) @ 0 (0) ---
futex(0x7fb24c327360, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn(0x7fb24c327360)            = 202
futex(0x7fb24d15a9d0, FUTEX_WAIT, 1814, NULLPANIC: attached pid 1813 exited with 130
 <unfinished ... exit status 130>

如果是jvm自身执行了System.exit比如:

scala> System.exit(0)

那么在跟踪的信息里,是看不到signal的:

$ sudo strace -p 2131
Process 2131 attached - interrupt to quit
futex(0x7fc14adb49d0, FUTEX_WAIT, 2132, NULLPANIC: attached pid 2131 exited with 0
 <unfinished ... exit status 0>

至此我们可以判断出到底是外部还是内部引起的了,如果是内部就不必麻烦Systemtap了,可以从源码去找。

tomcat关闭应用时的清理工作(3): ThreadLocal

tomcat在关闭时,可能看到与ThreadLocal相关的警告:

Jan 24, 2014 7:18:52 AM org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks
SEVERE: The web application [] created a ThreadLocal with key of type 
[java.lang.ThreadLocal] (value [java.lang.ThreadLocal@4e61bc49]) and a value of type 
[com.alibaba.xxx.Entry] (value [...]) but failed to remove it when the web application was stopped. 
Threads are going to be renewed over time to try and avoid a probable memory leak.

用下面的例子来模拟一下

@WebServlet(name = "MainServlet", urlPatterns = { "/main" }, loadOnStartup = 1)
public class MainServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    private static ThreadLocal<MyClass> tl = new ThreadLocal<MyClass>();

    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {

        MyClass m = tl.get();
        if (m == null) {
            tl.set(new MyClass());
        }
    }

}

class MyClass {}

请求一次之后,通过脚本会命令停止tomcat,会看到类似的日志:

七月 16, 2014 5:01:35 下午 org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks
严重: The web application [] created a ThreadLocal with key of type [java.lang.ThreadLocal] 
(value [java.lang.ThreadLocal@7da150]) and a value of type [org.r113.servlet3.MyClass] 
(value [org.r113.servlet3.MyClass@37e98b70]) but failed to remove it when the web application was stopped. 
Threads are going to be renewed over time to try and avoid a probable memory leak.

这个泄露其实是可能造成classloader的泄露,因为ThreadLocal引用了自定义的类MyClass,绑定到了当前的请求线程上,而请求线程又是线程池里的线程,生存周期可能会比较长。比如上面模拟的情况,要停止应用的时候,请求线程的ThreadLocal仍未释放,那么即使加载MyClass类的classLoader已经不会再被任何地方使用,可以被垃圾回收了,却因为这个MyClass被引用而得不到回收。

tomcat关闭应用时的清理工作(2): 线程的清理

tomcat在关闭时,有时会看到类似下面的警告信息:

2014-7-10 13:44:02 org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named 
[com.taobao.xxx.client.Timer] but has failed to stop it. 
This is very likely to create a memory leak.

这是tomcat关闭应用时检测到了应用启动的线程未被终止,tomcat为防止造成内存泄露,给出上面的警告,并根据配置来决定是否强制停止该线程(默认不会强制停止)。

有时也会有另一种相似的警告信息:

2014-7-10 13:44:02 org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] is still processing a request that has yet to finish. 
This is very likely to create a memory leak. 
You can control the time allowed for requests to finish 
by using the unloadDelay attribute of the standard Context implementation.

这是tomcat关闭应用时检测到了仍有请求线程未处理完。

上面的2种警告都是在WebappClassLoaderclearReferencesThreads方法里给出的,该方法也是在stop时调用clearReferences方法时调用的:

protected void clearReferences() {
    ...
    // Stop any threads the web application started
    clearReferencesThreads();
    ...
}

clearReferencesThreads方法里,通过找到最顶层的root thread group获取所有的active线程,然后判断这些线程如果是用户线程的话,给出警告:

if (isRequestThread(thread)) {
    log.error(sm.getString("webappClassLoader.warnRequestThread",
                            contextName, thread.getName()));
} else {
    log.error(sm.getString("webappClassLoader.warnThread",
                            contextName, thread.getName()));
}

我们来模拟一下,先看第一种情况:

@WebServlet(name = "MainServlet", urlPatterns = { "/main" }, loadOnStartup = 1)
public class MainServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    public void init() {
        new Thread() {
            public void run() {
              try {
                while (true) {
                    Thread.sleep(1000);
                }
              } catch (Exception e) {
              }
            }
        }.start();
    }
}

在一个servlet初始化时启动了一个线程,没有提供销毁这个线程的机制,当tomcat停止时,会报第一种警告。

再模拟第二种警告情况,在请求时将线程hang住:

public void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
    try {
        Thread.sleep(1000 * 200);
    } catch (Exception e) {
    }
}

这时关闭tomcat时会报出第二种警告信息。

默认tomcat并不会强制将这些线程终止,除非设置了clearReferencesStopThreads为true,它判断线程属于某个线程池则延迟一段时间将线程终止,否则直接调用了JDK已不鼓励的Thread.stop方法终止线程。

if (usingExecutor) {
    // Executor may take a short time to stop all the
    // threads. Make a note of threads that should be
    // stopped and check them at the end of the method.
    executorThreadsToStop.add(thread);
} else {
    // This method is deprecated and for good reason. This
    // is very risky code but is the only option at this
    // point. A *very* good reason for apps to do this
    // clean-up themselves.
    thread.stop();
}