1.概述
由于netty动辄管理100w+的连接,每一个连接都会有很多超时任务。
比如发送超时、心跳检测间隔等,如果每一个定时任务都启动一个`Timer`,不仅低效,而且会消耗大量的资源。
根据George Varghese 和 Tony Lauck 1996 年的论文:
[Hashed and Hierarchical Timing Wheels: data structures to efficiently implement a timer facility]。
提出了一种定时轮的方式来管理和维护大量的`Timer`调度.
2.Netty中的HashedWheelTimer
时间轮其实就是一种环形的数据结构,可以想象成时钟,分成很多格子,
一个格子代码一段时间(这个时间越短,Timer的精度越高)。
并用一个链表报错在该格子上的到期任务,同时一个指针随着时间一格一格转动,
并执行相应格子中的到期任务。任务通过取摸决定放入那个格子。
假设一个格子是1秒,则整个wheel能表示的时间段为8s,假如当前指针指向1,
此时需要调度一个3s后执行的任务,显然应该加入到(1+3=4)的方格中,指针再走3次就可以执行了;
如果任务要在10s后执行,应该等指针走完一个round零2格再执行,
因此应放入3,同时将round(1)保存到任务中。
检查到期任务时应当只执行round为0的,格子上其他任务的round应减1。
与java中的Hashmap很像。其实就是HashMap的哈希拉链算法,
只不过多了指针转动与一些定时处理的逻辑。
所以其相关的操作和HashMap也一致:
>> 添加任务:O(1)
>> 删除/取消任务:O(1)
>> 过期/执行任务:
最差情况为O(n)->也就是当HashMap里面的元素全部hash冲突,退化为一条链表的情况。
平均O(1)->显然,格子越多,每个格子上的链表就越短,这里需要权衡时间与空间。
#多层时间轮
如果任务的时间跨度很大,数量很大,单层的时间轮会造成任务的round很大,
单个格子的链表很长。这时候可以将时间轮分层,类似于时钟的时分秒3层。
多层的时间轮造成的算法复杂度的进一步提升。
单层时间轮只需增加每一轮的格子就能解决链表过长的问题。
因此,更倾向使用单层的时间轮,netty4中时间轮的实现也是单层的。
2.1 源码示例
/**
* A {@link Timer} optimized for approximated I/O timeout scheduling.
*
* <h3>Tick Duration</h3>
*
* 提供了一种非精确的定时任务调度方式, 可以通过调整构造器中 tickDuration 来调整精确度.
* 大多数应用中的 IO 超时任务不需要太精确到某一秒, 因此 tickDuration 默认值是 100ms.
*
* <h3>Ticks per Wheel (Wheel Size)</h3>
* 该类维护了一个叫做 'wheel' 的数据结构, 即hash表, 实际上是一个数组, 默认大小是 512.
*
* <h3>不要创建太多的实例.</h3>
*
* <h3>实现细节</h3>
*
* {@link HashedWheelTimer} is based on
* <a href="http://cseweb.ucsd.edu/users/varghese/">George Varghese</a> and
* Tony Lauck's paper,
* <a href="http://cseweb.ucsd.edu/users/varghese/PAPERS/twheel.ps.Z">'Hashed
* and Hierarchical Timing Wheels: data structures to efficiently implement a
* timer facility'</a>. More comprehensive slides are located
* <a href="http://www.cse.wustl.edu/~cdgill/courses/cs6874/TimingWheels.ppt">here</a>.
*/
public class HashedWheelTimer implements Timer {
static final InternalLogger logger = InternalLoggerFactory.getInstance(HashedWheelTimer.class);
private static final AtomicInteger INSTANCE_COUNTER = new AtomicInteger();
private static final AtomicBoolean WARNED_TOO_MANY_INSTANCES = new AtomicBoolean();
private static final int INSTANCE_COUNT_LIMIT = 64;
private static final long MILLISECOND_NANOS = TimeUnit.MILLISECONDS.toNanos(1);
private static final ResourceLeakDetector<HashedWheelTimer> leakDetector = ResourceLeakDetectorFactory.instance().newResourceLeakDetector(HashedWheelTimer.class, 1);
private static final AtomicIntegerFieldUpdater<HashedWheelTimer> WORKER_STATE_UPDATER = AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimer.class, "workerState");
private final ResourceLeakTracker<HashedWheelTimer> leak;
// 每一个 HashedWheelTimer 中唯一的工作线程
private final Worker worker = new Worker();
private final Thread workerThread;
public static final int WORKER_STATE_INIT = 0;
public static final int WORKER_STATE_STARTED = 1;
public static final int WORKER_STATE_SHUTDOWN = 2;
@SuppressWarnings({ "unused", "FieldMayBeFinal" })
private volatile int workerState; // 0 - init, 1 - started, 2 - shut down
// tick的时长,也就是指针多久转一格, 比如秒针从 3 走到 4.
private final long tickDuration;
// 实际存放定时任务的桶, 即每个格子上的对象, 是个双向链表
private final HashedWheelBucket[] wheel;
// 掩码, 用于计算任务被 hash 到哪一格(tick) 上
private final int mask;
private final CountDownLatch startTimeInitialized = new CountDownLatch(1);
// 存放 timeouts 任务的队列, 使用JCTool–一个高性能的的并发Queue实现包
private final Queue<HashedWheelTimeout> timeouts = PlatformDependent.newMpscQueue();
// 存放已取消的 timeouts 任务的队列
private final Queue<HashedWheelTimeout> cancelledTimeouts = PlatformDependent.newMpscQueue();
// 等待处理的 timeouts 任务的计数
private final AtomicLong pendingTimeouts = new AtomicLong(0);
// 一个 HashedWheelTimer 对象中, 最大的 timeouts 任务数
private final long maxPendingTimeouts;
private volatile long startTime;
public HashedWheelTimer(
ThreadFactory threadFactory,
long tickDuration, // tick的时长,也就是指针多久转一格, 比如秒针从 3 走到 4.
TimeUnit unit,
int ticksPerWheel, // 一圈有几格, 默认值是 512
boolean leakDetection, // 是否开启内存泄露检测
long maxPendingTimeouts) { // 最大等待时间
ObjectUtil.checkNotNull(threadFactory, "threadFactory");
ObjectUtil.checkNotNull(unit, "unit");
ObjectUtil.checkPositive(tickDuration, "tickDuration");
ObjectUtil.checkPositive(ticksPerWheel, "ticksPerWheel");
// 创建时间轮基本的数据结构,一个数组。长度为不小于ticksPerWheel的最小2的n次方
wheel = createWheel(ticksPerWheel);
// 这是一个标示符,用来快速计算任务应该呆的格子。
// 我们知道,给定一个deadline的定时任务,其应该呆的格子=deadline%wheel.length.
// 但是%操作是个相对耗时的操作,所以使用一种变通的位运算代替:
// 因为一圈的长度为2的n次方,mask = 2^n-1后低位将全部是1,然后 deadline&mast == deadline%wheel.length
// java中的HashMap也是使用这种处理方法
mask = wheel.length - 1;
// 转换成纳秒处理
long duration = unit.toNanos(tickDuration);
// Prevent overflow.
if (duration >= Long.MAX_VALUE / wheel.length) {
throw new IllegalArgumentException(String.format("tickDuration: %d (expected: 0 < tickDuration in nanos < %d", tickDuration, Long.MAX_VALUE / wheel.length));
}
if (duration < MILLISECOND_NANOS) {
logger.warn("Configured tickDuration {} smaller then {}, using 1ms.", tickDuration, MILLISECOND_NANOS);
this.tickDuration = MILLISECOND_NANOS;
} else {
this.tickDuration = duration;
}
// 创建worker线程, 一个 HashedWheelTimer 对象, 只有一个线程
workerThread = threadFactory.newThread(worker);
// 这里默认是启动内存泄露检测:当HashedWheelTimer实例超过当前cpu可用核数*4的时候,将发出警告
leak = leakDetection || !workerThread.isDaemon() ? leakDetector.track(this) : null;
this.maxPendingTimeouts = maxPendingTimeouts;
// 如果一个应用进程中 HashedWheelTimer 对象数超过 64, 这里会打印出 warn 日志
if (INSTANCE_COUNTER.incrementAndGet() > INSTANCE_COUNT_LIMIT &&
WARNED_TOO_MANY_INSTANCES.compareAndSet(false, true)) {
reportTooManyInstances();
}
}
@Override
protected void finalize() throws Throwable {
try {
super.finalize();
} finally {
// This object is going to be GCed and it is assumed the ship has sailed to do a proper shutdown. If
// we have not yet shutdown then we want to make sure we decrement the active instance count.
if (WORKER_STATE_UPDATER.getAndSet(this, WORKER_STATE_SHUTDOWN) != WORKER_STATE_SHUTDOWN) {
INSTANCE_COUNTER.decrementAndGet();
}
}
}
// 创建 hash 桶
private static HashedWheelBucket[] createWheel(int ticksPerWheel) {
if (ticksPerWheel <= 0) {
throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
}
if (ticksPerWheel > 1073741824) {
throw new IllegalArgumentException("ticksPerWheel may not be greater than 2^30: " + ticksPerWheel);
}
// 标准化 hash 轮的 size, 变为 2 的 n 次方, 方便进行移位运算, 提高效率
ticksPerWheel = normalizeTicksPerWheel(ticksPerWheel);
HashedWheelBucket[] wheel = new HashedWheelBucket[ticksPerWheel];
for (int i = 0; i < wheel.length; i ++) {
wheel[i] = new HashedWheelBucket();
}
return wheel;
}
// 启动时间轮。这个方法其实不需要显示的主动调用,因为在添加定时任务(newTimeout()方法)的时候会自动调用此方法。
public void start() {
// 判断当前时间轮的状态,如果是初始化,则启动worker线程,启动整个时间轮;如果已经启动则略过;如果是已经停止,则报错
// 这里是一个Lock Free的设计。因为可能有多个线程调用启动方法,这里使用AtomicIntegerFieldUpdater原子的更新时间轮的状态
switch (WORKER_STATE_UPDATER.get(this)) {
case WORKER_STATE_INIT:
if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
workerThread.start();
}
break;
case WORKER_STATE_STARTED:
break;
case WORKER_STATE_SHUTDOWN:
throw new IllegalStateException("cannot be started once stopped");
default:
throw new Error("Invalid WorkerState");
}
// 等待worker线程初始化时间轮的启动时间
while (startTime == 0) {
try {
startTimeInitialized.await();
} catch (InterruptedException ignore) {
// Ignore - it will be ready very soon.
}
}
}
@Override
public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
ObjectUtil.checkNotNull(task, "task");
ObjectUtil.checkNotNull(unit, "unit");
long pendingTimeoutsCount = pendingTimeouts.incrementAndGet();
if (maxPendingTimeouts > 0 && pendingTimeoutsCount > maxPendingTimeouts) {
pendingTimeouts.decrementAndGet();
throw new RejectedExecutionException("Number of pending timeouts ("
+ pendingTimeoutsCount + ") is greater than or equal to maximum allowed pending "
+ "timeouts (" + maxPendingTimeouts + ")");
}
// 如果时间轮没有启动,则启动
start();
// Add the timeout to the timeout queue which will be processed on the next tick.
// During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;
// Guard against overflow.
if (delay > 0 && deadline < 0) {
deadline = Long.MAX_VALUE;
}
// 这里定时任务不是直接加到对应的格子中,而是先加入到一个队列里,然后等到下一个tick的时候,会从队列里取出最多100000个任务加入到指定的格子中
HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
timeouts.add(timeout);
return timeout;
}
private final class Worker implements Runnable {
private final Set<Timeout> unprocessedTimeouts = new HashSet<Timeout>();
private long tick;
@Override
public void run() {
// 初始化startTime.只有所有任务的的deadline都是相对于这个时间点
startTime = System.nanoTime();
if (startTime == 0) {
// 由于System.nanoTime()可能返回0,甚至负数。并且0是一个标示符,用来判断startTime是否被初始化,所以当startTime=0的时候,重新赋值为1
startTime = 1;
}
// 唤醒阻塞在start()的线程
startTimeInitialized.countDown();
// 只要时间轮的状态为WORKER_STATE_STARTED,就循环的“转动”tick,循环判断响应格子中的到期任务
do {
// 计算下次tick的时间, 然后sleep到下次tick
final long deadline = waitForNextTick();
if (deadline > 0) {
// 获取格子对应的索引值
int idx = (int) (tick & mask);
// 处理取消的任务
processCancelledTasks();
// 获取当前格子对应的 timeouts 所在的对象, 如数组 arr[i]
HashedWheelBucket bucket = wheel[idx];
// 将 timeouts 转换入某一个格子中
transferTimeoutsToBuckets();
// 处理当前格子的任务, 每次取出 100000 个
bucket.expireTimeouts(deadline);
// 格子向前转动一格
tick++;
}
} while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);
// 如果上一步死循环停止了, 这里清除掉所有的未处理的任务
// Fill the unprocessedTimeouts so we can return them from stop() method.
for (HashedWheelBucket bucket: wheel) {
bucket.clearTimeouts(unprocessedTimeouts);
}
// 将还没有加入到格子中的待处理定时任务队列中的任务取出,如果是未取消的任务,则加入到未处理任务队列中,以供stop()方法返回
for (;;) {
HashedWheelTimeout timeout = timeouts.poll();
if (timeout == null) {
break;
}
if (!timeout.isCancelled()) {
unprocessedTimeouts.add(timeout);
}
}
// 处理取消的任务
processCancelledTasks();
}
// 将newTimeout()方法中加入到待处理定时任务队列中的任务加入到指定的格子中
private void transferTimeoutsToBuckets() {
// 每次tick只处理10w个任务,以免阻塞worker线程
for (int i = 0; i < 100000; i++) {
HashedWheelTimeout timeout = timeouts.poll();
if (timeout == null) {
// all processed
break;
}
if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) {
// Was cancelled in the meantime.
continue;
}
// 计算任务需要经过多少个tick
long calculated = timeout.deadline / tickDuration;
// 计算任务的轮数
timeout.remainingRounds = (calculated - tick) / wheel.length;
// 如果任务在timeouts队列里面放久了, 以至于已经过了执行时间, 这个时候就使用当前tick, 也就是放到当前bucket, 此方法调用完后就会被执行.
final long ticks = Math.max(calculated, tick); // Ensure we don't schedule for past.
int stopIndex = (int) (ticks & mask);
// 将任务加入到响应的格子中
HashedWheelBucket bucket = wheel[stopIndex];
bucket.addTimeout(timeout);
}
}
// 将取消的任务取出,并从格子中移除
private void processCancelledTasks() {
for (;;) {
HashedWheelTimeout timeout = cancelledTimeouts.poll();
if (timeout == null) {
// all processed
break;
}
try {
timeout.remove();
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("An exception was thrown while process a cancellation task", t);
}
}
}
}
// sleep, 直到下次tick到来, 然后返回该次tick和启动时间之间的时长
private long waitForNextTick() {
// 下次tick的时间点, 用于计算需要sleep的时间
long deadline = tickDuration * (tick + 1);
for (;;) {
// 计算需要sleep的时间, 之所以加999999后再除10000000, 是为了保证足够的sleep时间
// 例如:当deadline - currentTime=2000002的时候,如果不加999999,则只睡了2ms,
// 而2ms其实是未到达deadline这个时间点的,所有为了使上述情况能sleep足够的时间,加上999999后,会多睡1ms
final long currentTime = System.nanoTime() - startTime;
long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;
if (sleepTimeMs <= 0) {
if (currentTime == Long.MIN_VALUE) {
return -Long.MAX_VALUE;
} else {
return currentTime;
}
}
// 这里是因为windows平台的定时调度最小单位为10ms,如果不是10ms的倍数,可能会引起sleep时间不准确
if (PlatformDependent.isWindows()) {
sleepTimeMs = sleepTimeMs / 10 * 10;
if (sleepTimeMs == 0) {
sleepTimeMs = 1;
}
}
try {
Thread.sleep(sleepTimeMs);
} catch (InterruptedException ignored) {
if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
return Long.MIN_VALUE;
}
}
}
}
public Set<Timeout> unprocessedTimeouts() {
return Collections.unmodifiableSet(unprocessedTimeouts);
}
}
private static final class HashedWheelTimeout implements Timeout {
// 定义定时任务的3个状态:初始化、取消、过期
private static final int ST_INIT = 0;
private static final int ST_CANCELLED = 1;
private static final int ST_EXPIRED = 2;
// 用来CAS方式更新定时任务状态
private static final AtomicIntegerFieldUpdater<HashedWheelTimeout> STATE_UPDATER = AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimeout.class, "state");
private final HashedWheelTimer timer;
// 具体到期需要执行的任务
private final TimerTask task;
private final long deadline;
@SuppressWarnings({"unused", "FieldMayBeFinal", "RedundantFieldInitialization" })
private volatile int state = ST_INIT;
// remainingRounds will be calculated and set by Worker.transferTimeoutsToBuckets() before the
// HashedWheelTimeout will be added to the correct HashedWheelBucket.
// 离任务执行的轮数,当将次任务加入到格子中是计算该值,每过一轮,该值减一。
long remainingRounds;
// 双向链表结构,由于只有worker线程会访问,这里不需要synchronization / volatile
HashedWheelTimeout next;
HashedWheelTimeout prev;
// 定时任务所在的格子
HashedWheelBucket bucket;
HashedWheelTimeout(HashedWheelTimer timer, TimerTask task, long deadline) {
this.timer = timer;
this.task = task;
this.deadline = deadline;
}
@Override
public Timer timer() {
return timer;
}
@Override
public TimerTask task() {
return task;
}
@Override
public boolean cancel() {
// only update the state it will be removed from HashedWheelBucket on next tick.
if (!compareAndSetState(ST_INIT, ST_CANCELLED)) {
return false;
}
// If a task should be canceled we put this to another queue which will be processed on each tick.
// So this means that we will have a GC latency of max. 1 tick duration which is good enough. This way
// we can make again use of our MpscLinkedQueue and so minimize the locking / overhead as much as possible.
timer.cancelledTimeouts.add(this);
return true;
}
void remove() {
HashedWheelBucket bucket = this.bucket;
if (bucket != null) {
bucket.remove(this);
} else {
timer.pendingTimeouts.decrementAndGet();
}
}
public boolean compareAndSetState(int expected, int state) {
return STATE_UPDATER.compareAndSet(this, expected, state);
}
public int state() {
return state;
}
@Override
public boolean isCancelled() {
return state() == ST_CANCELLED;
}
@Override
public boolean isExpired() {
return state() == ST_EXPIRED;
}
// 过期并执行任务
public void expire() {
if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
return;
}
try {
task.run(this);
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
}
}
}
}
/**
* Bucket that stores HashedWheelTimeouts. These are stored in a linked-list like datastructure to allow easy
* removal of HashedWheelTimeouts in the middle. Also the HashedWheelTimeout act as nodes themself and so no
* extra object creation is needed.
*/
private static final class HashedWheelBucket {
// 指向格子中任务的首尾
private HashedWheelTimeout head;
private HashedWheelTimeout tail;
// 添加任务到当前格子
public void addTimeout(HashedWheelTimeout timeout) {
assert timeout.bucket == null;
timeout.bucket = this;
if (head == null) {
head = tail = timeout;
} else {
tail.next = timeout;
timeout.prev = tail;
tail = timeout;
}
}
// 过期并执行格子中的到期任务,tick到该格子的时候,worker线程会调用这个方法,根据deadline和remainingRounds判断任务是否过期
public void expireTimeouts(long deadline) {
HashedWheelTimeout timeout = head;
// 遍历格子中的所有定时任务
while (timeout != null) {
HashedWheelTimeout next = timeout.next;
// 定时任务到期
if (timeout.remainingRounds <= 0) {
next = remove(timeout);
if (timeout.deadline <= deadline) {
timeout.expire();
} else {
// 如果round数已经为0,deadline却>当前格子的deadline,说放错格子了,这种情况应该不会出现
throw new IllegalStateException(String.format("timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
}
} else if (timeout.isCancelled()) {
next = remove(timeout);
} else { /没有到期,轮数-1
timeout.remainingRounds --;
}
timeout = next;
}
}
// 将任务移出链表
public HashedWheelTimeout remove(HashedWheelTimeout timeout) {
HashedWheelTimeout next = timeout.next;
// remove timeout that was either processed or cancelled by updating the linked-list
if (timeout.prev != null) {
timeout.prev.next = next;
}
if (timeout.next != null) {
timeout.next.prev = timeout.prev;
}
if (timeout == head) {
// if timeout is also the tail we need to adjust the entry too
if (timeout == tail) {
tail = null;
head = null;
} else {
head = next;
}
} else if (timeout == tail) {
// if the timeout is the tail modify the tail to be the prev node.
tail = timeout.prev;
}
// null out prev, next and bucket to allow for GC.
timeout.prev = null;
timeout.next = null;
timeout.bucket = null;
timeout.timer.pendingTimeouts.decrementAndGet();
return next;
}
/**
* Clear this bucket and return all not expired / cancelled {@link Timeout}s.
*/
public void clearTimeouts(Set<Timeout> set) {
for (;;) {
HashedWheelTimeout timeout = pollTimeout();
if (timeout == null) {
return;
}
if (timeout.isExpired() || timeout.isCancelled()) {
continue;
}
set.add(timeout);
}
}
private HashedWheelTimeout pollTimeout() {
HashedWheelTimeout head = this.head;
if (head == null) {
return null;
}
HashedWheelTimeout next = head.next;
if (next == null) {
tail = this.head = null;
} else {
this.head = next;
next.prev = null;
}
// null out prev and next to allow for GC.
head.next = null;
head.prev = null;
head.bucket = null;
return head;
}
}
}
3.使用示例
3.1 使用示例
package com.zy.netty.netty07;
import io.netty.util.HashedWheelTimer;
import io.netty.util.Timeout;
import io.netty.util.Timer;
import io.netty.util.TimerTask;
import java.util.Objects;
import java.util.concurrent.TimeUnit;
public class HashedWheelTimerTest {
private static final HashedWheelTimer timer = new HashedWheelTimer(2L, TimeUnit.SECONDS, 2);
public static void main(String[] args) {
RetryTask retryTask = new RetryTask(3, 10);
timer.newTimeout(retryTask, 5L, TimeUnit.SECONDS);
}
private static class RetryTask implements TimerTask {
private final int retries;
private final long tick;
private int retryTimes = 1;
private RetryTask(int retries, long tick) {
this.retries = retries;
this.tick = tick;
}
@Override
public void run(Timeout timeout) throws Exception {
try {
System.out.println("第" + retryTimes + "次重试");
throw new RuntimeException("...");
// do task
} catch (Throwable e) {
retryTimes++;
if (retryTimes > retries) {
// log.error
} else {
rePutTimeout(timeout);
}
}
}
private void rePutTimeout(Timeout timeout) {
if (Objects.isNull(timeout)) {
return;
}
Timer timer = timeout.timer();
timer.newTimeout(timeout.task(), tick, TimeUnit.SECONDS);
}
}
}
3.2 注意事项
当前一个任务执行时间过长的时候,会影响后续任务的到期执行时间的。
也就是说其中的任务是串行执行的。所以,要求里面的任务都要短平快。
参考资源
https://cseweb.ucsd.edu/users/varghese/PAPERS/twheel.ps.Z