26 May 2015

收到CPU 300多告警,4核CPU基本跑满。

1.输出线程栈信息

sudo -u admin /opt/taobao/java/bin/jstack -l 818 > jstack.out

2.查看占用CPU的线程id

top -Hp 818
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
  1956 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:31.82 java    
  2054 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:33.48 java    
  2280 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:31.37 java    
  2312 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:30.70 java    
  2474 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:31.31 java    
  2482 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:34.57 java    
  2531 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:37.95 java    
  2559 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:30.95 java    
  2569 admin     20   0 6696m 3.2g  16m R  3.0 40.0   2:29.35 java    
  1962 admin     20   0 6696m 3.2g  16m R  2.7 40.0   2:32.71 java    
  2053 admin     20   0 6696m 3.2g  16m R  2.7 40.0   2:34.13 java    
  2246 admin     20   0 6696m 3.2g  16m R  2.7 40.0   2:33.25 java    
  2258 admin     20   0 6696m 3.2g  16m R  2.7 40.0   2:35.16 java    

3.转换上面的线程pid为16进制,到jstack中通过nid查找并查看线程栈信息

## 1926 -> 0x7a4

grep -A100 0x7a4  jstack.out
"http-apr-7001-exec-1" daemon prio=10 tid=0x00007f698092f800 nid=0x7a4 runnable [0x0000000053445000]
   java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.get(HashMap.java:303)
        at org.apache.velocity.util.introspection.ClassMap$MethodCache.get(ClassMap.java:248)
        at org.apache.velocity.util.introspection.ClassMap.findMethod(ClassMap.java:102)
        at org.apache.velocity.util.introspection.IntrospectorBase.getMethod(IntrospectorBase.java:105)
        at org.apache.velocity.util.introspection.Introspector.getMethod(Introspector.java:94)
        at org.apache.velocity.runtime.parser.node.BooleanPropertyExecutor.discover(BooleanPropertyExecutor.java:78)
        at org.apache.velocity.runtime.parser.node.PropertyExecutor.<init>(PropertyExecutor.java:56)
        at org.apache.velocity.runtime.parser.node.BooleanPropertyExecutor.<init>(BooleanPropertyExecutor.java:53)
        at com.alibaba.citrus.service.velocity.impl.CustomizedUberspectImpl.getPropertyGet(CustomizedUberspectImpl.java:57)
        at org.apache.velocity.runtime.parser.node.ASTIdentifier.execute(ASTIdentifier.java:148)
        at org.apache.velocity.runtime.parser.node.ASTReference.execute(ASTReference.java:252)
        at org.apache.velocity.runtime.parser.node.ASTReference.render(ASTReference.java:339)
        at org.apache.velocity.runtime.parser.node.ASTBlock.render(ASTBlock.java:72)
        at org.apache.velocity.runtime.parser.node.ASTIfStatement.render(ASTIfStatement.java:87)
        at org.apache.velocity.runtime.parser.node.ASTBlock.render(ASTBlock.java:72)
        at org.apache.velocity.runtime.directive.Foreach.render(Foreach.java:364)
        at org.apache.velocity.runtime.parser.node.ASTDirective.render(ASTDirective.java:176)
        at org.apache.velocity.runtime.parser.node.ASTBlock.render(ASTBlock.java:72)
        at org.apache.velocity.runtime.parser.node.ASTIfStatement.render(ASTIfStatement.java:87)
        at org.apache.velocity.runtime.parser.node.SimpleNode.render(SimpleNode.java:336)
        at org.apache.velocity.Template.merge(Template.java:328)
        at org.apache.velocity.Template.merge(Template.java:235)
        at com.alibaba.citrus.service.velocity.impl.VelocityEngineImpl.mergeTemplate(VelocityEngineImpl.java:185)
        at com.alibaba.citrus.service.velocity.impl.VelocityEngineImpl.mergeTemplate(VelocityEngineImpl.java:169)
        at com.alibaba.citrus.service.velocity.impl.VelocityEngineImpl.writeTo(VelocityEngineImpl.java:129)
        at com.alibaba.citrus.service.template.impl.TemplateServiceImpl.writeTo(TemplateServiceImpl.java:227)

4.问题定位

GG了一下,multi-thread merge causes 100% cpu :https://issues.apache.org/jira/browse/VELOCITY-718

velocity 1.6.1及以下版本Template.merge使用了非线程安全的HashMap做为自省的Method缓存。

导致了ReHash并发死循环问题:http://coolshell.cn/articles/9606.html

总结

1.系统运行了一年半,第一次出现这个问题,线程问题就在那里,冷不丁就触发了。

2.万恶的间接依赖,之前velocity间接依赖的1.6版本,升级到1.6.4。