Improper thread pool usage exhausted the Dubbo thread pool

1. Production alert

[org.apache.dubbo.common.threadpool.support.AbortPolicyWithReport] [rejectedExecution] [NettyServerWorker-13-3] [] []  ---   [DUBBO] Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-10.11.99.213:10393, Pool Size: 200 (active: 200, core: 200, max: 200, largest: 200), Task: 5199 (completed: 4999), Executor status:(isShutdown:false, isTerminated:false, isTerminating:false)

2. Dump the thread information

The dump shows a very large number of threads in the WAITING state, for example:

"xxx-sync-pool-1-thread-1353" #2091 prio=5 os_prio=0 tid=0x00007f32d0033000 nid=0x85f waiting on condition [0x00007f329ac3d000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000003b7c24680> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
    at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1947)
    at cn.xxx.jc.pbets.biz.shared.service.bet.impl.xxxxx.lambda$initEvalCache$54(EvaluationResultServiceImpl.java:829)
    at cn.xxx.jc.pbets.biz.shared.service.bet.impl.xxxxx$$Lambda$2953/1878309232.get(Unknown Source)
    at cn.xxx.jc.pbets.common.util.DistributedLockUtils.lock(DistributedLockUtils.java:42)
    at cn.xxx.jc.pbets.biz.shared.service.bet.impl.xxxxx.initEvalCache(EvaluationResultServiceImpl.java:724)
    at cn.xxx.jc.pbets.biz.shared.service.bet.impl.xxxxx.lambda$null$10(EvaluationResultServiceImpl.java:345)
    at cn.xxx.jc.pbets.biz.shared.service.bet.impl.xxxxx$$Lambda$2952/1040750602.run(Unknown Source)
    at org.apache.skywalking.apm.plugin.wrapper.SwRunnableWrapper.run(SwRunnableWrapper.java:43)
    at org.apache.skywalking.apm.plugin.transmittable.thread.local.v2x.wrapper.SwRunnableWrapper.run(SwRunnableWrapper.java:30)
    at com.alibaba.ttl.TtlRunnable.run(TtlRunnable.java:60)
    at cn.xxx.jc2.common.util.thread.ThreadMdcUtil.lambda$wrap$3(ThreadMdcUtil.java:72)
    at cn.xxx.jc2.common.util.thread.ThreadMdcUtil$$Lambda$2598/832064415.run(Unknown Source)
    at org.apache.skywalking.apm.toolkit.trace.RunnableWrapper.run$original$VFOzA03Q(RunnableWrapper.java:34)
    at org.apache.skywalking.apm.toolkit.trace.RunnableWrapper.run$original$VFOzA03Q$accessor$q4svxsz0(RunnableWrapper.java)
    at org.apache.skywalking.apm.toolkit.trace.RunnableWrapper$auxiliary$wVM5ggPv.call(Unknown Source)
    at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:86)
    at org.apache.skywalking.apm.toolkit.trace.RunnableWrapper.run(RunnableWrapper.java)
    at org.apache.skywalking.apm.plugin.wrapper.SwRunnableWrapper.run(SwRunnableWrapper.java:43)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
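
To quantify "a very large number" rather than eyeball it, the dump can be tallied by pool and thread state. The following is a minimal sketch, not the tool used in the original investigation: the class name `DumpStateCounter` and the grouping rule (strip the trailing thread number from the thread name) are assumptions, and it only relies on the two line shapes visible in the trace above (the quoted thread name and the `java.lang.Thread.State:` line).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DumpStateCounter {
    // A thread header line starts with the quoted thread name.
    private static final Pattern NAME = Pattern.compile("^\"([^\"]+)\"");
    // The state line follows the header, e.g. "java.lang.Thread.State: WAITING (parking)".
    private static final Pattern STATE = Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

    /** Counts threads per "<pool prefix> <state>" in a jstack-style text dump. */
    public static Map<String, Integer> countByPoolAndState(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        String pool = null;
        for (String line : dump.split("\\R")) {
            Matcher name = NAME.matcher(line);
            if (name.find()) {
                // Assumption: the pool prefix is the thread name without its trailing number.
                pool = name.group(1).replaceAll("-\\d+$", "");
                continue;
            }
            Matcher state = STATE.matcher(line);
            if (pool != null && state.find()) {
                counts.merge(pool + " " + state.group(1), 1, Integer::sum);
                pool = null; // count each thread once
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java DumpStateCounter <path to a saved thread dump>
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        countByPoolAndState(dump).forEach((key, n) -> System.out.println(n + "\t" + key));
    }
}
```

A pool whose threads are almost all counted under WAITING, as in the trace above, is the one to look at first.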

3. Find the thread pool configuration in use via the trace


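The configuration shows a core pool size of 12, a maximum pool size of 50 and a queue capacity of 1000.
Yet the thread in the dump is already number 1353. A pool only creates threads beyond the core size once its queue is full, so the backlog must have reached 1000 waiting tasks, and threads up to the maximum were then repeatedly created and destroyed.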
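
The growth rule behind that inference can be reproduced with a plain `java.util.concurrent.ThreadPoolExecutor`. The sketch below assumes the business pool is a standard `ThreadPoolExecutor` backed by a bounded `LinkedBlockingQueue` (the article does not show how the pool is built); the class name, the 60-second keep-alive and the latch-based "stuck task" are illustrative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {

    /** Submits tasks that all block, then returns {pool size, queue size}. */
    public static int[] submitBlocking(int core, int max, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // stands in for a slow downstream call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] {pool.getPoolSize(), pool.getQueue().size()};
        } finally {
            release.countDown(); // let every task finish so the pool can shut down
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] a = submitBlocking(12, 50, 1000, 1012);
        System.out.println("1012 tasks: threads=" + a[0] + ", queued=" + a[1]); // threads=12, queued=1000
        int[] b = submitBlocking(12, 50, 1000, 1050);
        System.out.println("1050 tasks: threads=" + b[0] + ", queued=" + b[1]); // threads=50, queued=1000
    }
}
```

With 12 busy core threads, the next 1000 tasks only queue up; the 13th thread appears when task 1013 arrives. A thread number as high as 1353 against a maximum of 50 therefore implies many such overflow episodes, with idle non-core threads timing out in between.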



The business code performs a great deal of asynchronous processing.



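The suspicion, therefore, is that the business code running on the Dubbo threads fans out too many parallel tasks, so most of them sit in the queue waiting to be processed. Although the Dubbo thread pool has 200 threads, every Dubbo request that reaches the business code is served by the same 12 threads. Large numbers of Dubbo requests end up waiting: wide in, narrow out.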
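
The "wide in, narrow out" effect can be illustrated in isolation. This is a simplified sketch, not the service's code: `requestThreads` stands in for the Dubbo server pool, `bizPool` for the shared business pool, and the sleep for a call to another service; all names and sizes are hypothetical. It measures how many tasks ever run at the same time, however many request threads are waiting on them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class WideInNarrowOutDemo {

    /** Returns the peak number of business tasks that ran concurrently. */
    public static int peakConcurrency(int callers, int workers, long taskMillis) {
        ExecutorService requestThreads = Executors.newFixedThreadPool(callers); // "wide" side
        ExecutorService bizPool = Executors.newFixedThreadPool(workers);        // "narrow" side
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<?>> requests = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                requests.add(requestThreads.submit(() -> {
                    CompletableFuture.runAsync(() -> {
                        peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                        sleep(taskMillis); // simulated remote call
                        running.decrementAndGet();
                    }, bizPool).join(); // the request thread parks here until bizPool reaches its task
                }));
            }
            for (Future<?> request : requests) {
                request.get();
            }
            return peak.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            requestThreads.shutdownNow();
            bizPool.shutdownNow();
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 200 request threads, but never more than 12 tasks in flight.
        System.out.println("peak concurrency = " + peakConcurrency(200, 12, 50));
    }
}
```

However large the request-side pool is, throughput is capped by the business pool, and every request thread whose task is still queued stays parked in `join()`, which matches the WAITING threads seen in the dump.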

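4. Optimization

Set the core pool size to 100, the maximum pool size to 150 and the queue capacity to 100. The reasoning is that the vast majority of the asynchronous tasks call other services to fetch data, so the work is I/O-bound.
After the change was released and observed in production, the problem was resolved.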
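
Expressed as code, the new settings might look like the sketch below. Only the three numbers (100, 150, 100) come from the change described above; the class and method names, the 60-second keep-alive, the `LinkedBlockingQueue` type and the default rejection handler are assumptions, since the article does not show how the pool is actually constructed.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TunedPoolConfig {

    /** Builds a pool with the tuned sizes: core 100, max 150, queue capacity 100. */
    public static ThreadPoolExecutor newBizPool() {
        return new ThreadPoolExecutor(
                100,                             // core threads, always available for I/O-bound tasks
                150,                             // upper bound once the queue is full
                60L, TimeUnit.SECONDS,           // assumed keep-alive for non-core threads
                new LinkedBlockingQueue<>(100)); // short queue, so the pool grows early
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBizPool();
        System.out.println("core=" + pool.getCorePoolSize()
                + ", max=" + pool.getMaximumPoolSize()
                + ", queueCapacity=" + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

Besides raising the core size, shrinking the queue from 1000 to 100 means the pool starts adding threads after 100 queued tasks instead of 1000, so a backlog is met with more workers much sooner.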
