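Before introducing Dubbo's Cluster, let's look at where it sits in Dubbo's overall design. According to the official documentation, Cluster is the routing layer: it encapsulates routing and load balancing across multiple providers and bridges the registry. It is centered on Invoker, and its core extension interfaces are Cluster, Directory, Router, and LoadBalance.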
# A -> B means A depends on B
Cluster -> Directory & LoadBalance
Directory -> Router
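The flow by which the virtual Invoker is exposed is: Cluster => (Directory => Router) => LoadBalance => Invoker. Following this order, we start with Cluster. Cluster is not part of the core layers; its purpose is to disguise multiple Invokers as a single Invoker, so that other layers only need to care about the Invoker at the Protocol layer. Adding or removing Cluster does not affect the other layers, because Cluster is unnecessary when there is only one provider. This article focuses on the fault tolerance of the Cluster layer and on its core interfaces (LoadBalance was covered in an earlier article).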
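To make the dependency chain concrete, here is a minimal sketch of how a cluster can combine a directory, a router, and a load balancer into one virtual invoker. All types (`MiniInvoker`, `MiniRouter`, `MiniLoadBalance`, `MiniDirectory`, `MiniCluster`) are hypothetical simplifications invented for illustration; the real Dubbo interfaces also take `URL` and `Invocation` arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, heavily simplified stand-ins; not the real Dubbo API.
interface MiniInvoker { String invoke(String request); }
interface MiniRouter { List<MiniInvoker> route(List<MiniInvoker> candidates, String request); }
interface MiniLoadBalance { MiniInvoker select(List<MiniInvoker> candidates); }

// Directory -> Router: the directory holds all providers and lets the router filter them.
class MiniDirectory {
    private final List<MiniInvoker> all;
    private final MiniRouter router;

    MiniDirectory(List<MiniInvoker> all, MiniRouter router) {
        this.all = all;
        this.router = router;
    }

    List<MiniInvoker> list(String request) {
        return router.route(new ArrayList<>(all), request);
    }
}

// Cluster -> Directory & LoadBalance: join() returns one virtual invoker hiding the rest.
class MiniCluster {
    static MiniInvoker join(MiniDirectory directory, MiniLoadBalance loadBalance) {
        return request -> loadBalance.select(directory.list(request)).invoke(request);
    }
}

class ClusterFlowDemo {
    static String call(String request) {
        MiniInvoker a = r -> "A:" + r;
        MiniInvoker b = r -> "B:" + r;
        // A router that drops the first provider, and a load balancer that picks the first survivor.
        MiniRouter skipFirst = (candidates, r) -> candidates.subList(1, candidates.size());
        MiniLoadBalance first = candidates -> candidates.get(0);
        MiniInvoker virtual = MiniCluster.join(new MiniDirectory(Arrays.asList(a, b), skipFirst), first);
        return virtual.invoke(request);
    }

    public static void main(String[] args) {
        System.out.println(call("ping"));
    }
}
```

The caller only ever holds `virtual`, which is the point of the Cluster layer: the number of providers behind it is invisible.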
<T> Invoker<T> join(Directory<T> directory) throws RpcException;
public FailoverClusterInvoker(Directory<T> directory) { super(directory); }
Failover: FailoverCluster -> FailoverClusterInvoker (the default SPI implementation of Cluster)
public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    // Work on a local reference (thread confinement) for concurrency safety
    List<Invoker<T>> copyInvokers = invokers;
    checkInvokers(copyInvokers, invocation);
    String methodName = RpcUtils.getMethodName(invocation);
    // len = retries + 1: the first call plus the configured retries
    // (3 attempts with the default of 2 retries); always at least 1 attempt
    int len = getUrl().getMethodParameter(methodName, Constants.RETRIES_KEY, Constants.DEFAULT_RETRIES) + 1;
    if (len <= 0) {
        len = 1;
    }
    // retry loop.
    RpcException le = null; // last exception.
    List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyInvokers.size()); // invoked invokers.
    Set<String> providers = new HashSet<String>(len);
    for (int i = 0; i < len; i++) {
        //Reselect before retry to avoid a change of candidate `invokers`.
        //NOTE: if `invokers` changed, then `invoked` also lose accuracy.
        // On a retry, pull the latest Invoker list from the directory
        if (i > 0) {
            checkWhetherDestroyed();
            copyInvokers = list(invocation);
            // check again
            checkInvokers(copyInvokers, invocation);
        }
        // Calls AbstractClusterInvoker.select
        Invoker<T> invoker = select(loadbalance, invocation, copyInvokers, invoked);
        invoked.add(invoker);
        RpcContext.getContext().setInvokers((List) invoked);
        try {
            // If the call throws, the exception is handled below and the loop retries
            Result result = invoker.invoke(invocation);
            return result;
        } catch (RpcException e) {
            if (e.isBiz()) { // biz exception.
                throw e;
            }
            le = e;
        } catch (Throwable e) {
            le = new RpcException(e.getMessage(), e);
        } finally {
            providers.add(invoker.getUrl().getAddress());
        }
    }
    // All attempts failed: an RpcException is thrown here (omitted)
}
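The retry loop can be reduced to a few lines once the Dubbo types are stripped away. The following is a minimal sketch under stated assumptions: providers are modeled as `Function<String, String>`, "load balancing" is a hypothetical helper that prefers providers not yet tried, and every failure is a `RuntimeException`. It is not the Dubbo implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class FailoverSketch {
    // Makes up to retries + 1 attempts, preferring providers that have not been tried yet.
    static String invoke(List<Function<String, String>> providers, int retries, String request) {
        int attempts = Math.max(retries + 1, 1);
        List<Function<String, String>> invoked = new ArrayList<>();
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            Function<String, String> provider = select(providers, invoked);
            invoked.add(provider);
            try {
                return provider.apply(request);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next attempt
            }
        }
        throw new IllegalStateException("all " + attempts + " attempts failed", last);
    }

    // Hypothetical stand-in for load balancing: first untried provider, else the first one.
    private static Function<String, String> select(List<Function<String, String>> providers,
                                                   List<Function<String, String>> invoked) {
        for (Function<String, String> p : providers) {
            if (!invoked.contains(p)) {
                return p;
            }
        }
        return providers.get(0);
    }

    static String demo() {
        Function<String, String> down = r -> { throw new IllegalStateException("down"); };
        Function<String, String> up = r -> "ok:" + r;
        return invoke(Arrays.asList(down, up), 2, "req");
    }
}
```

In `demo()`, the first attempt hits the failing provider and the second attempt succeeds, so the caller never sees the first failure.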
Failback: FailbackCluster -> FailbackClusterInvoker
protected Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    Invoker<T> invoker = null;
    try {
        checkInvokers(invokers, invocation);
        invoker = select(loadbalance, invocation, invokers, null);
        return invoker.invoke(invocation);
    } catch (Throwable e) {
        // The call failed: wrap the current Invoker in a RetryTimerTask and put it into a HashedWheelTimer bucket
        logger.error("Failback to invoke method " + invocation.getMethodName() + ", wait for retry in background. Ignored exception: "
                + e.getMessage() + ", ", e);
        addFailed(loadbalance, invocation, invokers, invoker);
        return new RpcResult(); // ignore
    }
}

// The core run method of RetryTimerTask
public void run(Timeout timeout) {
    try {
        // Again pick the Invoker to retry with according to the load-balancing policy
        Invoker<T> retryInvoker = select(loadbalance, invocation, invokers, Collections.singletonList(lastInvoker));
        lastInvoker = retryInvoker;
        // Retry
        retryInvoker.invoke(invocation);
    } catch (Throwable e) {
        logger.error("Failed retry to invoke method " + invocation.getMethodName() + ", waiting again.", e);
        if ((++retryTimes) >= retries) {
            logger.error("Failed retry times exceed threshold (" + retries + "), We have to abandon, invocation->" + invocation);
        } else {
            // On another failure, the task is put back into the bucket
            rePut(timeout);
        }
    }
}

// Put the failed invocation into the timer's bucket
private void addFailed(LoadBalance loadbalance, Invocation invocation, List<Invoker<T>> invokers, Invoker<T> lastInvoker) {
    // Lazily initialize the HashedWheelTimer (double-checked locking)
    if (failTimer == null) {
        synchronized (this) {
            if (failTimer == null) {
                failTimer = new HashedWheelTimer(
                        new NamedThreadFactory("failback-cluster-timer", true), 1, TimeUnit.SECONDS, 32, failbackTasks);
            }
        }
    }
    RetryTimerTask retryTimerTask = new RetryTimerTask(loadbalance, invocation, invokers, lastInvoker, retries, RETRY_FAILED_PERIOD);
    try {
        failTimer.newTimeout(retryTimerTask, RETRY_FAILED_PERIOD, TimeUnit.SECONDS);
    } catch (Throwable e) {
        logger.error("Failback background works error,invocation->" + invocation + ", exception: " + e.getMessage());
    }
}
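The same "return empty now, retry in the background" idea can be sketched with the JDK alone. This sketch swaps Dubbo's `HashedWheelTimer` for a `ScheduledExecutorService`, which is a different mechanism with the same observable behavior for this purpose. The class name, the `onRecovered` callback, and the millisecond period are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class FailbackSketch {
    // Single daemon thread standing in for the timer that drives background retries.
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "failback-sketch-timer");
        t.setDaemon(true);
        return t;
    });
    private final int retries;
    private final long periodMillis;

    FailbackSketch(int retries, long periodMillis) {
        this.retries = retries;
        this.periodMillis = periodMillis;
    }

    // Returns the call's value, or null right away if it failed and was queued for retry.
    String invoke(Supplier<String> call, Runnable onRecovered) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            schedule(call, onRecovered, 1);
            return null; // the caller sees an "empty" result, similar to new RpcResult()
        }
    }

    private void schedule(Supplier<String> call, Runnable onRecovered, int attempt) {
        timer.schedule(() -> {
            try {
                call.get();
                onRecovered.run();
            } catch (RuntimeException e) {
                if (attempt < retries) {
                    schedule(call, onRecovered, attempt + 1); // put the task back, like rePut(timeout)
                } // otherwise the retry budget is used up and the task is abandoned
            }
        }, periodMillis, TimeUnit.MILLISECONDS);
    }

    static boolean demo() {
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch recovered = new CountDownLatch(1);
        // Fails on the first two calls, succeeds on the third.
        Supplier<String> flaky = () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("down");
            }
            return "ok";
        };
        FailbackSketch failback = new FailbackSketch(3, 10);
        String first = failback.invoke(flaky, recovered::countDown);
        try {
            return first == null && recovered.await(2, TimeUnit.SECONDS) && calls.get() == 3;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Failback suits operations such as notifications, where the caller does not need the result but the call should eventually go through.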
Failfast: FailfastCluster -> FailfastClusterInvoker
public Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    checkInvokers(invokers, invocation);
    // Select an Invoker through the parent class's select method and call it once; on failure, throw immediately
    Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
    try {
        return invoker.invoke(invocation);
    } catch (Throwable e) {
        // An exception is thrown here without any retry (omitted)
    }
}
Failsafe: FailsafeCluster -> FailsafeClusterInvoker
public Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    try {
        checkInvokers(invokers, invocation);
        // Select an Invoker through the parent class's select method and call it; on failure, return an empty RpcResult
        Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
        return invoker.invoke(invocation);
    } catch (Throwable e) {
        logger.error("Failsafe ignore exception: " + e.getMessage(), e);
        return new RpcResult(); // ignore
    }
}
Available: AvailableCluster -> AvailableClusterInvoker (no load balancing needed)
public Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    for (Invoker<T> invoker : invokers) {
        // Straightforward: call the first available Invoker and return its result;
        // if none is available, throw an RpcException
        if (invoker.isAvailable()) {
            return invoker.invoke(invocation);
        }
    }
    throw new RpcException("No provider available in " + invokers);
}
Forking: ForkingCluster -> ForkingClusterInvoker
public Result doInvoke(final Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    try {
        checkInvokers(invokers, invocation);
        final List<Invoker<T>> selected;
        final int forks = getUrl().getParameter(Constants.FORKS_KEY, Constants.DEFAULT_FORKS);
        final int timeout = getUrl().getParameter(Constants.TIMEOUT_KEY, Constants.DEFAULT_TIMEOUT);
        if (forks <= 0 || forks >= invokers.size()) {
            selected = invokers;
        } else {
            // Select up to `forks` candidate Invokers
            selected = new ArrayList<>();
            for (int i = 0; i < forks; i++) {
                // TODO. Add some comment here, refer chinese version for more details.
                Invoker<T> invoker = select(loadbalance, invocation, invokers, selected);
                if (!selected.contains(invoker)) {
                    //Avoid add the same invoker several times.
                    selected.add(invoker);
                }
            }
        }
        RpcContext.getContext().setInvokers((List) selected);
        final AtomicInteger count = new AtomicInteger();
        // Blocking queue that holds the asynchronous results
        final BlockingQueue<Object> ref = new LinkedBlockingQueue<>();
        // Call every candidate Invoker in parallel and put the outcome into the queue
        for (final Invoker<T> invoker : selected) {
            executor.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        Result result = invoker.invoke(invocation);
                        ref.offer(result);
                    } catch (Throwable e) {
                        int value = count.incrementAndGet();
                        if (value >= selected.size()) {
                            ref.offer(e);
                        }
                    }
                }
            });
        }
        try {
            // Return as soon as the first outcome arrives
            Object ret = ref.poll(timeout, TimeUnit.MILLISECONDS);
            if (ret instanceof Throwable) {
                Throwable e = (Throwable) ret;
                throw new RpcException(e instanceof RpcException ? ((RpcException) e).getCode() : 0,
                        "Failed to forking invoke provider " + selected + ", but no luck to perform the invocation. Last error is: " + e.getMessage(),
                        e.getCause() != null ? e.getCause() : e);
            }
            return (Result) ret;
        } catch (InterruptedException e) {
            throw new RpcException("Failed to forking invoke provider " + selected + ", but no luck to perform the invocation. Last error is: " + e.getMessage(), e);
        }
    } finally {
        // clear attachments which is binding to current thread.
        RpcContext.getContext().clearAttachments();
    }
}
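The key trick in forking is the combination of a blocking queue and a failure counter: a success is offered immediately, while an error is offered only once every fork has failed. Below is a minimal sketch of that pattern using only the JDK. Providers are modeled as `Supplier<String>`, and the explicit timeout check is an addition of this sketch, not something taken from the Dubbo code above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class ForkingSketch {
    // Calls all providers in parallel and returns the first successful value.
    static String invoke(List<Supplier<String>> providers, long timeoutMillis) {
        ExecutorService executor = Executors.newCachedThreadPool();
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        AtomicInteger failures = new AtomicInteger();
        try {
            for (Supplier<String> provider : providers) {
                executor.execute(() -> {
                    try {
                        queue.offer(provider.get());
                    } catch (RuntimeException e) {
                        // Report an error only after every fork has failed.
                        if (failures.incrementAndGet() >= providers.size()) {
                            queue.offer(e);
                        }
                    }
                });
            }
            Object first = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first == null) {
                throw new IllegalStateException("timed out waiting for any fork");
            }
            if (first instanceof RuntimeException) {
                throw new IllegalStateException("all forks failed", (RuntimeException) first);
            }
            return (String) first;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            executor.shutdownNow();
        }
    }

    static String demo() {
        Supplier<String> failing = () -> { throw new IllegalStateException("down"); };
        Supplier<String> working = () -> "ok";
        return invoke(Arrays.asList(failing, working), 1000);
    }
}
```

Forking trades extra provider load for lower latency, so it fits read operations with strict response-time requirements better than writes.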
Mergeable: MergeableCluster -> MergeableClusterInvoker (no load balancing needed)
protected Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    checkInvokers(invokers, invocation);
    // Is a merger configured for this method?
    String merger = getUrl().getMethodParameter(invocation.getMethodName(), Constants.MERGER_KEY);
    // No merger configured: degrade to the behavior of AvailableClusterInvoker
    if (ConfigUtils.isEmpty(merger)) { // If a method doesn't have a merger, only invoke one Group
        for (final Invoker<T> invoker : invokers) {
            if (invoker.isAvailable()) {
                try {
                    return invoker.invoke(invocation);
                } catch (RpcException e) {
                    // Exception handling omitted
                }
            }
        }
        return invokers.iterator().next().invoke(invocation);
    }
    // Return type of the method
    Class<?> returnType;
    try {
        returnType = getInterface().getMethod(
                invocation.getMethodName(), invocation.getParameterTypes()).getReturnType();
    } catch (NoSuchMethodException e) {
        returnType = null;
    }
    // Map of asynchronous call results: <invoker.getUrl().getServiceKey(), Future<Result>>
    Map<String, Future<Result>> results = new HashMap<String, Future<Result>>();
    for (final Invoker<T> invoker : invokers) {
        // The thread pool runs the calls asynchronously
        Future<Result> future = executor.submit(new Callable<Result>() {
            @Override
            public Result call() throws Exception {
                return invoker.invoke(new RpcInvocation(invocation, invoker));
            }
        });
        results.put(invoker.getUrl().getServiceKey(), future);
    }
    Object result = null;
    List<Result> resultList = new ArrayList<Result>(results.size());
    // Collect the results for the merge step below
    int timeout = getUrl().getMethodParameter(invocation.getMethodName(), Constants.TIMEOUT_KEY, Constants.DEFAULT_TIMEOUT);
    for (Map.Entry<String, Future<Result>> entry : results.entrySet()) {
        Future<Result> future = entry.getValue();
        try {
            Result r = future.get(timeout, TimeUnit.MILLISECONDS);
            if (r.hasException()) {
                log.error("Invoke " + getGroupDescFromServiceKey(entry.getKey()) +
                        " failed: " + r.getException().getMessage(), r.getException());
            } else {
                resultList.add(r);
            }
        } catch (Exception e) {
            throw new RpcException("Failed to invoke service " + entry.getKey() + ": " + e.getMessage(), e);
        }
    }
    // resultList now holds the results of the asynchronous invoker calls
    if (resultList.isEmpty()) {
        return new RpcResult((Object) null);
    } else if (resultList.size() == 1) {
        return resultList.iterator().next();
    }
    // If the method's return type is void, return directly
    if (returnType == void.class) {
        return new RpcResult((Object) null);
    }
    // A merger value starting with "." names a merge method on the return type
    if (merger.startsWith(".")) {
        merger = merger.substring(1);
        Method method;
        try {
            // Look up the merge method
            method = returnType.getMethod(merger, returnType);
        } catch (NoSuchMethodException e) {
            throw new RpcException("Can not merge result because missing method [ " + merger + " ] in class [ " +
                    returnType.getClass().getName() + " ]");
        }
        // Make the method accessible if it is not public
        if (!Modifier.isPublic(method.getModifiers())) {
            method.setAccessible(true);
        }
        // Take the first entry of resultList and use its value as the accumulator
        result = resultList.remove(0).getValue();
        try {
            if (method.getReturnType() != void.class
                    && method.getReturnType().isAssignableFrom(result.getClass())) {
                // Fold the remaining results into the accumulator with the custom merge method
                for (Result r : resultList) {
                    result = method.invoke(result, r.getValue());
                }
            } else {
                // The merge method has no usable return value: invoke it only for its side effect
                for (Result r : resultList) {
                    method.invoke(result, r.getValue());
                }
            }
        } catch (Exception e) {
            throw new RpcException("Can not merge result: " + e.getMessage(), e);
        }
    } else {
        Merger resultMerger;
        // merger == default: use the default merger that matches returnType
        if (ConfigUtils.isDefault(merger)) {
            resultMerger = MergerFactory.getMerger(returnType);
        } else {
            // Otherwise use the merger extension with the given name
            resultMerger = ExtensionLoader.getExtensionLoader(Merger.class).getExtension(merger);
        }
        if (resultMerger != null) {
            List<Object> rets = new ArrayList<Object>(resultList.size());
            for (Result r : resultList) {
                rets.add(r.getValue());
            }
            result = resultMerger.merge(
                    rets.toArray((Object[]) Array.newInstance(returnType, 0)));
        } else {
            throw new RpcException("There is no merger to merge result.");
        }
    }
    return new RpcResult(result);
}
Broadcast: BroadcastCluster -> BroadcastClusterInvoker (no load balancing needed)
public Result doInvoke(final Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    checkInvokers(invokers, invocation);
    RpcContext.getContext().setInvokers((List) invokers);
    RpcException exception = null;
    Result result = null;
    // Call every Invoker in turn and log any exception; the returned result is that of the last Invoker.
    // If any call failed, the last exception is thrown after all Invokers have been called.
    for (Invoker<T> invoker : invokers) {
        try {
            result = invoker.invoke(invocation);
        } catch (RpcException e) {
            exception = e;
            logger.warn(e.getMessage(), e);
        } catch (Throwable e) {
            exception = new RpcException(e.getMessage(), e);
            logger.warn(e.getMessage(), e);
        }
    }
    if (exception != null) {
        throw exception;
    }
    return result;
}
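One detail of broadcast is easy to miss: a failure does not stop the loop, and the exception surfaces only after every provider has been called. The sketch below shows that ordering with plain `Function<String, String>` providers; the class and the `log` list are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class BroadcastSketch {
    // Calls every provider; rethrows the last failure only after the loop has finished.
    static String invoke(List<Function<String, String>> providers, String request) {
        RuntimeException lastError = null;
        String result = null;
        for (Function<String, String> provider : providers) {
            try {
                result = provider.apply(request);
            } catch (RuntimeException e) {
                lastError = e; // keep going so the remaining providers are still called
            }
        }
        if (lastError != null) {
            throw lastError;
        }
        return result;
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Function<String, String> a = r -> {
            log.add("a");
            throw new IllegalStateException("a failed");
        };
        Function<String, String> b = r -> {
            log.add("b");
            return "b:" + r;
        };
        try {
            invoke(Arrays.asList(a, b), "req");
        } catch (IllegalStateException e) {
            log.add("error:" + e.getMessage());
        }
        return log;
    }
}
```

In `demo()`, provider `b` is still called even though `a` failed first, and the caller then receives the failure from `a`.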
public AbstractClusterInvoker(Directory<T> directory, URL url) {
    if (directory == null) {
        throw new IllegalArgumentException("service directory == null");
    }
    this.directory = directory;
    //sticky: invoker.isAvailable() should always be checked before using when availablecheck is true.
    this.availablecheck = url.getParameter(Constants.CLUSTER_AVAILABLE_CHECK_KEY, Constants.DEFAULT_CLUSTER_AVAILABLE_CHECK);
}
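1. First determine whether the sticky policy is enabled; the value comes from the URL parameter sticky.
2. Check whether the current sticky Invoker is still in the list of candidate Invokers; if not, set it to null.
3. If the sticky policy is enabled, the current stickyInvoker is available, and it has not yet been used in this call (that is, within a single invoke of the virtual Invoker it has not been selected before, which keeps the calls spread as evenly as possible across the original Invokers), return stickyInvoker directly.
4. Otherwise, use the load-balancing policy to select one of the original Invokers (see the doSelect method below for details).
5. If the sticky policy is enabled, assign the Invoker from step 4 to stickyInvoker.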
protected Invoker<T> select(LoadBalance loadbalance, Invocation invocation,
                            List<Invoker<T>> invokers, List<Invoker<T>> selected) throws RpcException {
    if (CollectionUtils.isEmpty(invokers)) {
        return null;
    }
    String methodName = invocation == null ? StringUtils.EMPTY : invocation.getMethodName();
    boolean sticky = invokers.get(0).getUrl()
            .getMethodParameter(methodName, Constants.CLUSTER_STICKY_KEY, Constants.DEFAULT_CLUSTER_STICKY);
    //ignore overloaded method
    // If stickyInvoker is no longer contained in invokers, reset it to null
    if (stickyInvoker != null && !invokers.contains(stickyInvoker)) {
        stickyInvoker = null;
    }
    //ignore concurrency problem
    // Return stickyInvoker when sticky is enabled, stickyInvoker is non-null,
    // it has not been used in this call, and it is available
    if (sticky && stickyInvoker != null && (selected == null || !selected.contains(stickyInvoker))) {
        if (availablecheck && stickyInvoker.isAvailable()) {
            return stickyInvoker;
        }
    }
    // Otherwise select an invoker with the load-balancing policy; this is the part to focus on
    Invoker<T> invoker = doSelect(loadbalance, invocation, invokers, selected);
    if (sticky) {
        stickyInvoker = invoker;
    }
    return invoker;
}
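The sticky steps can be condensed into a small self-contained sketch. Here providers are plain strings, a round-robin cursor is a hypothetical stand-in for the load balancer, and the availability check is left out for brevity, so this illustrates the control flow only.

```java
import java.util.Arrays;
import java.util.List;

class StickySelector {
    private String sticky; // last provider handed out; plays the role of stickyInvoker
    private int cursor;    // round-robin stand-in for the load balancer

    String select(List<String> providers, List<String> alreadyTried, boolean stickyEnabled) {
        if (providers.isEmpty()) {
            return null;
        }
        // Drop the sticky provider if it is no longer a candidate.
        if (sticky != null && !providers.contains(sticky)) {
            sticky = null;
        }
        // Reuse the sticky provider unless it was already tried in this call.
        if (stickyEnabled && sticky != null && (alreadyTried == null || !alreadyTried.contains(sticky))) {
            return sticky;
        }
        String chosen = providers.get(cursor++ % providers.size());
        if (stickyEnabled) {
            sticky = chosen;
        }
        return chosen;
    }

    public static void main(String[] args) {
        StickySelector selector = new StickySelector();
        List<String> providers = Arrays.asList("p1", "p2");
        System.out.println(selector.select(providers, null, true));
        System.out.println(selector.select(providers, null, true));
        System.out.println(selector.select(providers, Arrays.asList("p1"), true));
    }
}
```

The third call passes `p1` as already tried, which models a retry inside one invocation: the sticky provider is skipped and the newly chosen one becomes sticky.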
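1. If invokers.size() == 1, return that invoker directly; otherwise go to step 2.
2. Select an invoker with the load balancer, then go to step 3.
3. If selected is non-null and the invoker from step 2 is already in selected (or the invoker is unavailable while availablecheck is on), go to step 4 to reselect.
4. Reselect; if the result is non-null, return it directly, otherwise go to step 5.
5. If reselection returns null, fall back to a modulo rule and return the invoker at the next index in invokers.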
private Invoker<T> doSelect(LoadBalance loadbalance, Invocation invocation,
                            List<Invoker<T>> invokers, List<Invoker<T>> selected) throws RpcException {
    if (CollectionUtils.isEmpty(invokers)) {
        return null;
    }
    if (invokers.size() == 1) {
        return invokers.get(0);
    }
    Invoker<T> invoker = loadbalance.select(invokers, getUrl(), invocation);
    //If the `invoker` is in the `selected` or invoker is unavailable && availablecheck is true, reselect.
    if ((selected != null && selected.contains(invoker))
            || (!invoker.isAvailable() && getUrl() != null && availablecheck)) {
        try {
            // Reselect; this is the part to focus on
            Invoker<T> rinvoker = reselect(loadbalance, invocation, invokers, selected, availablecheck);
            if (rinvoker != null) {
                invoker = rinvoker;
            } else {
                //Check the index of current selected invoker, if it's not the last one, choose the one at index+1.
                int index = invokers.indexOf(invoker);
                // Reselection failed: pick the next invoker by index modulo the list size
                try {
                    //Avoid collision
                    invoker = invokers.get((index + 1) % invokers.size());
                } catch (Exception e) {
                    logger.warn(e.getMessage() + " may because invokers list dynamic change, ignore.", e);
                }
            }
        } catch (Throwable t) {
            logger.error("cluster reselect fail reason is :" + t.getMessage() + " if can not solve, you can set cluster.availablecheck=false in url", t);
        }
    }
    return invoker;
}
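The last-resort fallback in doSelect is just "take the neighbor, wrapping around". A tiny generic sketch (the class and method names are invented for the example) makes the wrap-around and the not-found case explicit:

```java
import java.util.Arrays;
import java.util.List;

class NextIndexFallback {
    // Returns the element after `current`, wrapping to the start of the list.
    // If `current` is not in the list, indexOf returns -1 and the first element is returned.
    static <T> T next(List<T> candidates, T current) {
        int index = candidates.indexOf(current);
        return candidates.get((index + 1) % candidates.size());
    }

    public static void main(String[] args) {
        List<String> invokers = Arrays.asList("a", "b", "c");
        System.out.println(next(invokers, "a"));
        System.out.println(next(invokers, "c"));
    }
}
```

Because the fallback ignores both `selected` and availability, it only guarantees that a different invoker is returned, not that the call will succeed.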
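1. Initialize the reselectInvokers list with capacity invokers.size() - 1 (or invokers.size() when there is only one invoker); it collects the Invokers that have not been selected yet.
2. If reselectInvokers is non-empty, select one invoker with the load-balancing policy and return it; otherwise go to step 3.
3. If reselectInvokers is empty, meaning every candidate invoker is already in selected (or unavailable), filter the available invokers out of selected and store them in reselectInvokers.
4. Repeat step 2; if reselectInvokers is still empty, return null.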
private Invoker<T> reselect(LoadBalance loadbalance, Invocation invocation,
                            List<Invoker<T>> invokers, List<Invoker<T>> selected, boolean availablecheck) throws RpcException {
    //Allocating one in advance, this list is certain to be used.
    List<Invoker<T>> reselectInvokers = new ArrayList<>(
            invokers.size() > 1 ? (invokers.size() - 1) : invokers.size());
    // First, try picking a invoker not in `selected`.
    // Collect the invokers that are not in selected into reselectInvokers
    for (Invoker<T> invoker : invokers) {
        if (availablecheck && !invoker.isAvailable()) {
            continue;
        }
        if (selected == null || !selected.contains(invoker)) {
            reselectInvokers.add(invoker);
        }
    }
    if (!reselectInvokers.isEmpty()) {
        return loadbalance.select(reselectInvokers, getUrl(), invocation);
    }
    // Just pick an available invoker using loadbalance policy
    // If reselectInvokers is empty, collect the available invokers from selected into it
    if (selected != null) {
        for (Invoker<T> invoker : selected) {
            if ((invoker.isAvailable()) // available first
                    && !reselectInvokers.contains(invoker)) {
                reselectInvokers.add(invoker);
            }
        }
    }
    if (!reselectInvokers.isEmpty()) {
        return loadbalance.select(reselectInvokers, getUrl(), invocation);
    }
    return null;
}
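The two passes of reselect can be sketched without any Dubbo types. In this sketch providers are strings, availability is a `Predicate<String>`, and "load balancing" simply takes the first candidate; unlike the original, availability is always checked in the first pass (there is no availablecheck switch). All names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class ReselectSketch {
    static String reselect(List<String> providers, List<String> selected, Predicate<String> available) {
        List<String> candidates = new ArrayList<>();
        // Pass 1: available providers that have not been tried yet.
        for (String p : providers) {
            if (available.test(p) && (selected == null || !selected.contains(p))) {
                candidates.add(p);
            }
        }
        // Pass 2: nothing fresh is left, so fall back to already-tried providers that are available.
        if (candidates.isEmpty() && selected != null) {
            for (String p : selected) {
                if (available.test(p) && !candidates.contains(p)) {
                    candidates.add(p);
                }
            }
        }
        // Stand-in for loadbalance.select: take the first candidate, or null if there is none.
        return candidates.isEmpty() ? null : candidates.get(0);
    }

    public static void main(String[] args) {
        List<String> providers = Arrays.asList("a", "b", "c");
        // "a" was already tried and "b" is unavailable, so only "c" remains.
        System.out.println(reselect(providers, Arrays.asList("a"), p -> !p.equals("b")));
    }
}
```

A null return is what sends doSelect to its index-based fallback.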