
Troubleshooting a Full GC Problem Caused by Groovy

2022-08-10 20:26:00 InfoQ

1. Background
2. Analysis
  • 2.1 Parameter configuration
  • 2.2 Locating the problem
  • 2.3 JVM analysis
  • 2.4 Problem analysis
3. Solution

1. Background

After the Prometheus monitoring alerts went live, one service raised Full GC alarms every morning between 8:00 and 12:00.
This article records how the issue was investigated and resolved.

2. Analysis

2.1 Parameter configuration
The JVM parameters are configured as follows:
-Xms3g -Xmx3g -Xmn1g -XX:MetaspaceSize=128m -XX:ParallelGCThreads=5 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=80 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError 
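Young generation size: 1 GB (-Xmn1g);
Young generation collector: ParNew;
Old generation size: 2 GB (the 3 GB heap minus the 1 GB young generation);
Old generation collector: CMS (ConcMarkSweepGC);
CMS trigger condition: old generation occupancy reaches 80% or more.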
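The sizes listed above follow from simple arithmetic on the flags. The sketch below (class and method names are illustrative, not taken from the service) derives the old generation size and the approximate occupancy at which CMSInitiatingOccupancyFraction=80 would start a cycle. One caveat worth keeping in mind: unless -XX:+UseCMSInitiatingOccupancyOnly is also set, the JVM may use other criteria to start a CMS cycle, so the 80% figure is not a strict threshold.

```java
public class HeapLayout {
    public static final long MB = 1024L * 1024L;

    // Old generation = total heap (-Xmx) minus young generation (-Xmn).
    public static long oldGenBytes(long heapBytes, long youngBytes) {
        return heapBytes - youngBytes;
    }

    // Old generation occupancy that corresponds to the configured percentage.
    public static long cmsTriggerBytes(long oldGenBytes, int occupancyPercent) {
        return oldGenBytes * occupancyPercent / 100;
    }

    public static void main(String[] args) {
        long oldGen = oldGenBytes(3072 * MB, 1024 * MB); // -Xmx3g, -Xmn1g
        long trigger = cmsTriggerBytes(oldGen, 80);      // CMSInitiatingOccupancyFraction=80
        System.out.println("old generation: " + oldGen / MB + " MB");
        System.out.println("CMS trigger at about: " + trigger / MB + " MB");
    }
}
```

With these flags the old generation is 2048 MB, so a cycle would be expected once roughly 1.6 GB of it is occupied.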
2.2 Locating the problem
1. Because the alarms are concentrated between 8:00 and 12:00 in the morning, a scheduled task was the prime suspect.
2. Looking for candidate scheduled tasks, two have schedules that roughly match the alarm window:
Task 1: update customer information
CustomerScheduleJobService.updateCustomerDataDaily  0 0/30 8,9,10,11,12 * * ?

Task 2: create customer tasks
CustomerStaffScheduleJobService.jobCreateTask  0 10,40 7,8,9,10,11 * * ?
3. Identify the specific task
Two approaches were used to confirm:
1. Check the execution time of each scheduled task in the logs;
2. Pin the two scheduled tasks to different machines and observe each one separately.
Task execution times:
Task 1: finishes quickly and handles almost no business logic;
[03-09 08:00:00 062 INFO ] [] [] [] [customerDataStat-pool-0] bll.customer.CustomerUpdateInfoDailyBll - (123) logid=6907112718471909376 [BizCustomerBll.updateCustomerDataDaily] thread begin...Ip: 10.151.49.157
[03-09 08:01:25 476 INFO ] [] [] [] [customerDataStat-pool-0] bll.customer.CustomerUpdateInfoDailyBll - (125) logid=6907112718471909376 [BizCustomerBll.updateCustomerDataDaily] end total=0Ip: 10.151.49.157
Task 2: runs for about 35 minutes;
It starts at 8:10 and ends at 8:45;
[03-09 08:45:08 458 INFO ] [] [] [] [pool-4-thread-20] bll.task.CreateCustomerTaskBll - (109) logid=6907115234995589120 method=jobCreateTask msg=end queryRuleNum=7 queryCustomerNum=15962 createTaskCustomerNum=238 createTaskCount=271Ip: 10.151.49.157
This essentially confirms that the second scheduled task causes the Full GC.
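The run times quoted above can be checked directly from the log timestamps. The following is a small sketch, not part of the original investigation: the class name is illustrative, and the start time of task 2 is assumed to be exactly 08:10:00 from its cron expression, since no start log line is shown.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class TaskDuration {
    // Matches the time part of the log prefix, e.g. "08:01:25 476".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss SSS");

    public static Duration between(String start, String end) {
        return Duration.between(LocalTime.parse(start, LOG_TIME),
                                LocalTime.parse(end, LOG_TIME));
    }

    public static void main(String[] args) {
        // Task 1: both timestamps come from the log lines above.
        Duration task1 = between("08:00:00 062", "08:01:25 476");
        // Task 2: end time from the log; start time assumed from the cron schedule.
        Duration task2 = between("08:10:00 000", "08:45:08 458");
        System.out.println("task 1: " + task1.getSeconds() + " s");
        System.out.println("task 2: " + task2.toMinutes() + " min");
    }
}
```

Task 1 comes out at under a minute and a half, while task 2 occupies about 35 minutes of each half-hour slot, which matches the observation that the second task is the one overlapping the alarms.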
2.3 JVM analysis
2.3.1 Monitoring charts for a single day
[Figure: memory trend]
[Figure: GC trend]
2.3.2 Monitoring charts for the alarm period
[Figure: memory trend]
[Figure: GC trend]
2.3.3 Chart analysis
2.3.3.1 Old generation changes
Observations
1. While the task is running: the old generation grows noticeably, and a Full GC brings only a slight drop rather than a significant one;
2. After the task has finished: once the next task starts and a Full GC runs, old generation usage falls back to the level of the other machines, or even lower.
Notes
Objects move from the young generation to the old generation in several situations:
1. Large objects;
2. Objects that are old enough: the tenuring threshold is not set explicitly, so the CMS default of 6 applies, and jinfo confirms it is 6;
3. The survivor space cannot hold the objects that survive a young GC (YGC), so they are promoted directly to the old generation under the allocation guarantee mechanism.
Analysis
While the task is running, YGC occurs about 5 times per minute on average, so many objects reach the maximum tenuring age of 6 and are promoted to the old generation;
Because the task has not finished, these objects are still referenced, so a Full GC does not reduce usage significantly;
After the task ends, old generation usage does not drop right away the way survivor space usage does, mainly because no new Full GC is triggered until the next task starts. Once it is triggered, the old generation objects left behind by the finished task are no longer referenced, so they are collected normally.
2.3.3.2 Survivor space changes
The survivor space totals 100 MB. While the task is running it averages about 80 MB in use and peaks above 90 MB, so YGC also becomes frequent during this period, averaging 5 times per minute.
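The promotion argument above can be made concrete with a rough estimate. The sketch below is an approximation under simplifying assumptions (evenly spaced YGCs, an object's age increasing by one per YGC survived); the class and method names are illustrative.

```java
public class PromotionEstimate {
    // Approximate time an object must stay reachable before it is promoted,
    // assuming evenly spaced young collections.
    public static double secondsUntilPromotion(double ygcPerMinute, int tenuringThreshold) {
        return tenuringThreshold * 60.0 / ygcPerMinute;
    }

    public static void main(String[] args) {
        double seconds = secondsUntilPromotion(5, 6); // 5 YGC per minute, threshold 6
        System.out.println("promoted after roughly " + seconds + " s");
    }
}
```

At 5 YGCs per minute and a threshold of 6, anything that stays referenced for a little over a minute ends up in the old generation, which is far shorter than the roughly 35-minute task.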
2.3.3.3 Non-heap memory / method area / compressed class space changes
jstat GC statistics were collected on the two machines separately. The biggest difference is that MC (metaspace capacity, i.e. the method area size) and CCSC (compressed class space capacity) are significantly higher on the machine that ran the scheduled task than on the machine that did not;
[Figure: jstat output for the two machines]
While the task is running, method area usage follows the same curve as the old generation, and reclaiming this area also depends on old generation collection; this is visible in the Grafana monitoring charts as well;
2.3.3.4 Dump file analysis
[Figure: heap dump analysis]
Groovy-related classes account for 57.57%;
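The same non-heap figures that jstat reports can also be read from inside the JVM through the standard java.lang.management API. This was not part of the original investigation. It is a sketch of how such a comparison could be logged from the service itself. The class name is illustrative, and pool names such as "Metaspace" and "Compressed Class Space" are what HotSpot 8 typically reports, which may differ on other JVMs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class ClassMetadataSnapshot {
    // Sum of used bytes across all non-heap pools (metaspace, code cache, ...).
    public static long nonHeapUsedBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                total += usage.getUsed();
            }
        }
        return total;
    }

    // Number of classes currently loaded in this JVM.
    public static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                System.out.println(pool.getName() + ": " + usage.getUsed() / 1024 + " KB used");
            }
        }
        System.out.println("loaded classes: " + loadedClassCount());
    }
}
```

A loaded-class count that keeps climbing during the task and a metaspace pool that grows with it would point at dynamic class generation rather than ordinary heap pressure.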
2.4 Problem analysis
Java and Groovy versions
java version "1.8.0_191"

<dependency>
    <groupId>org.codehaus.groovy</groupId>
    <artifactId>groovy-all</artifactId>
    <version>2.4.15</version>
</dependency>
Where Groovy is used in the code: in the same scheduled task, when a task is about to be issued, an expression checks whether the issuing conditions are met, and that expression is evaluated with Groovy;
import groovy.lang.GroovyShell;

public class GroovyShellUtils {

    private static LoggerHelper logger = LoggerHelper.getLoggerHelper(GroovyShellUtils.class);

    // Evaluates the script text and returns its boolean result; returns false on any error.
    public static boolean explain(String scriptText) {
        try {
            // A new GroovyShell is created on every call.
            GroovyShell groovyShell = new GroovyShell();
            Object evaluate = groovyShell.evaluate(scriptText);
            return (boolean) evaluate;
        } catch (Exception e) {
            logger.error("", e);
        }
        return false;
    }
}


// Usage:

for (String rule : rules) {
    boolean res = GroovyShellUtils.explain(rule);
}

The problem can essentially be traced to where the Groovy script is loaded: used improperly, Groovy dynamically generates many new classes, so metaspace is consumed continuously;
Since Java 1.8, class metadata is stored in metaspace, which is off-heap (native) memory.
Each time Groovy evaluates a script, it dynamically compiles and loads the incoming text as a script class. When the input is plain text, the generated file name contains an auto-incrementing counter, so a new class is generated on every execution: 7 task rule checks per customer * 15962 customers = 111734 classes.
protected synchronized String generateScriptName() {
    return "Script" + (++counter) + ".groovy";
}
Internally, GroovyShell uses groovy.lang.GroovyClassLoader, which is the core of compiling and loading classes at runtime.
GroovyClassLoader keeps references to all the classes it creates, and a class can only be unloaded once the classloader that loaded it is collected, so memory leaks are easy to cause;
In summary, the incorrect use of Groovy leaves class metadata resident in off-heap memory, and it grows with the number of calls.
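To make the scale concrete, the naming scheme and the multiplication can be simulated without Groovy. The class below is a stand-in that only mirrors the generateScriptName snippet quoted in this section. It models a single counter for simplicity and does not compile anything. Its name and the distinctNames helper are illustrative.

```java
import java.util.HashSet;
import java.util.Set;

public class ScriptNameSimulation {
    private int counter;

    // Same naming scheme as the quoted snippet; a stand-in, not Groovy itself.
    public synchronized String generateScriptName() {
        return "Script" + (++counter) + ".groovy";
    }

    // Distinct script names produced when every rule check compiles its text again.
    public static int distinctNames(int rulesPerCustomer, int customers) {
        ScriptNameSimulation shell = new ScriptNameSimulation();
        Set<String> names = new HashSet<>();
        for (int c = 0; c < customers; c++) {
            for (int r = 0; r < rulesPerCustomer; r++) {
                names.add(shell.generateScriptName());
            }
        }
        return names.size();
    }

    public static void main(String[] args) {
        // 7 rules and 15962 customers, as reported in the task 2 log line.
        System.out.println(distinctNames(7, 15962));
    }
}
```

Even when the same rule text is passed repeatedly, every call yields a fresh name, so one task run of 7 rules over 15962 customers corresponds to 111734 generated script classes.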

3. Solution

1. Share one GroovyShell object across all scripts; do not create a new one on every iteration of the for loop;
2. After each execution, release the cached classes: shell.getClassLoader().clearCache();
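A complementary pattern, not described in the original fix, is to compile each distinct rule text only once and reuse the result. The sketch below shows the caching idea using only the JDK. CompiledRuleCache and compilationsFor are hypothetical names, and the "compiler" is a counter that stands in for a real compile step. With Groovy, the compile function could plausibly be a shared shell's parse(String), which returns a reusable groovy.lang.Script, but that wiring is an assumption and is left out so the example runs without the Groovy dependency.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class CompiledRuleCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;

    public CompiledRuleCache(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    // Compiles the script text on first use and returns the cached result afterwards.
    public T get(String scriptText) {
        return cache.computeIfAbsent(scriptText, compiler);
    }

    // Counts how many compilations happen when every customer checks every rule.
    public static int compilationsFor(String[] rules, int customers) {
        AtomicInteger compilations = new AtomicInteger();
        // Stand-in compiler: a real one would produce a compiled script object.
        CompiledRuleCache<String> cache =
                new CompiledRuleCache<>(text -> "Script" + compilations.incrementAndGet());
        for (int c = 0; c < customers; c++) {
            for (String rule : rules) {
                cache.get(rule);
            }
        }
        return compilations.get();
    }

    public static void main(String[] args) {
        String[] rules = {"a > 1", "b < 2", "a > 1"}; // two distinct texts
        System.out.println(compilationsFor(rules, 1000));
    }
}
```

This only helps if the rule text repeats across customers. If customer values are interpolated into the text, each text is unique and the cache never hits; in that case the values would need to be supplied separately from the script text (Groovy offers groovy.lang.Binding for this). Compiled Groovy scripts that carry a binding should also not be assumed to be thread-safe.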

Copyright notice
This article was created by [InfoQ]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/222/202208102001180551.html