收到CPU 300多告警,4核CPU基本跑满。
1.输出线程栈信息
sudo -u admin /opt/taobao/java/bin/jstack -l 818 > jstack.out
2.查看占用CPU的线程id
top -Hp 818
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1956 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:31.82 java
2054 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:33.48 java
2280 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:31.37 java
2312 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:30.70 java
2474 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:31.31 java
2482 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:34.57 java
2531 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:37.95 java
2559 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:30.95 java
2569 admin 20 0 6696m 3.2g 16m R 3.0 40.0 2:29.35 java
1962 admin 20 0 6696m 3.2g 16m R 2.7 40.0 2:32.71 java
2053 admin 20 0 6696m 3.2g 16m R 2.7 40.0 2:34.13 java
2246 admin 20 0 6696m 3.2g 16m R 2.7 40.0 2:33.25 java
2258 admin 20 0 6696m 3.2g 16m R 2.7 40.0 2:35.16 java
3.转换上面的线程pid为16进制,到jstack中通过nid查找并查看线程栈信息
## 1926 -> 0x7a4
grep -A100 0x7a4 jstack.out
"http-apr-7001-exec-1" daemon prio=10 tid=0x00007f698092f800 nid=0x7a4 runnable [0x0000000053445000]
java.lang.Thread.State: RUNNABLE
at java.util.HashMap.get(HashMap.java:303)
at org.apache.velocity.util.introspection.ClassMap$MethodCache.get(ClassMap.java:248)
at org.apache.velocity.util.introspection.ClassMap.findMethod(ClassMap.java:102)
at org.apache.velocity.util.introspection.IntrospectorBase.getMethod(IntrospectorBase.java:105)
at org.apache.velocity.util.introspection.Introspector.getMethod(Introspector.java:94)
at org.apache.velocity.runtime.parser.node.BooleanPropertyExecutor.discover(BooleanPropertyExecutor.java:78)
at org.apache.velocity.runtime.parser.node.PropertyExecutor.<init>(PropertyExecutor.java:56)
at org.apache.velocity.runtime.parser.node.BooleanPropertyExecutor.<init>(BooleanPropertyExecutor.java:53)
at com.alibaba.citrus.service.velocity.impl.CustomizedUberspectImpl.getPropertyGet(CustomizedUberspectImpl.java:57)
at org.apache.velocity.runtime.parser.node.ASTIdentifier.execute(ASTIdentifier.java:148)
at org.apache.velocity.runtime.parser.node.ASTReference.execute(ASTReference.java:252)
at org.apache.velocity.runtime.parser.node.ASTReference.render(ASTReference.java:339)
at org.apache.velocity.runtime.parser.node.ASTBlock.render(ASTBlock.java:72)
at org.apache.velocity.runtime.parser.node.ASTIfStatement.render(ASTIfStatement.java:87)
at org.apache.velocity.runtime.parser.node.ASTBlock.render(ASTBlock.java:72)
at org.apache.velocity.runtime.directive.Foreach.render(Foreach.java:364)
at org.apache.velocity.runtime.parser.node.ASTDirective.render(ASTDirective.java:176)
at org.apache.velocity.runtime.parser.node.ASTBlock.render(ASTBlock.java:72)
at org.apache.velocity.runtime.parser.node.ASTIfStatement.render(ASTIfStatement.java:87)
at org.apache.velocity.runtime.parser.node.SimpleNode.render(SimpleNode.java:336)
at org.apache.velocity.Template.merge(Template.java:328)
at org.apache.velocity.Template.merge(Template.java:235)
at com.alibaba.citrus.service.velocity.impl.VelocityEngineImpl.mergeTemplate(VelocityEngineImpl.java:185)
at com.alibaba.citrus.service.velocity.impl.VelocityEngineImpl.mergeTemplate(VelocityEngineImpl.java:169)
at com.alibaba.citrus.service.velocity.impl.VelocityEngineImpl.writeTo(VelocityEngineImpl.java:129)
at com.alibaba.citrus.service.template.impl.TemplateServiceImpl.writeTo(TemplateServiceImpl.java:227)
4.问题定位
GG了一下,multi-thread merge causes 100% cpu :https://issues.apache.org/jira/browse/VELOCITY-718
velocity 1.6.1及以下版本Template.merge使用了非线程安全的HashMap做为自省的Method缓存。
导致了ReHash并发死循环问题:http://coolshell.cn/articles/9606.html
总结
1.系统运行了一年半,第一次出现这个问题,线程问题就在那里,冷不丁就触发了。
2.万恶的间接依赖,之前velocity间接依赖的1.6版本,升级到1.6.4。