Fixing the bug in Eclipse 4.16 (2020-06) and later that prevents help documents under Chinese paths from being displayed.

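Background

Our in-house tool product was built on Eclipse 4.10, supported Windows, Linux, and macOS, and had been in stable use for years. Recently more colleagues switched to Macs, and many reported that the product would not start on macOS. Investigation showed that every affected machine ran Big Sur (11.x) or later, so the likely cause was an incompatibility introduced by the large macOS update. Testing with a plain Eclipse 4.10 confirmed this.
So upgrading the product's Eclipse base version was put on the schedule. The process was bumpy: a core component of our software only supports JDK 8 and has no upgrade plan, while every Eclipse version after 4.16 requires JDK 11 or later to run. Testing confirmed that 4.16 does support Big Sur, so we settled on upgrading the Eclipse base version to 4.16.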

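Symptoms

After compiling the product code against Eclipse 4.16, almost everything worked at runtime; only the help documentation could not be opened. The help table of contents displayed normally, but clicking a specific help topic showed nothing and the content area stayed blank. Copying the help document's address into a browser produced the following error:

[Screenshot: 打不开帮助文档.png (the help document cannot be opened)]

When viewing help documents nested deeper in the directory hierarchy, the following error also appeared: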

HTTP ERROR 500 java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern


URI:
/help/topic/xxx.doc.user/doc/%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C%2F%E5%BF%AB%E9%80%9F%E5%85%A5%E9%97%A8%2F%E5%BC%80%E5%A7%8B.html 

STATUS:
500 

MESSAGE:
java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern 

SERVLET:
org.eclipse.equinox.http.jetty.internal.HttpServerManager$InternalHttpServiceServlet-46631fce 

CAUSED BY:
java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern 

Caused by:
java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern
    at java.net.URLDecoder.decode(URLDecoder.java:187)
    at org.eclipse.equinox.http.servlet.internal.HttpServiceRuntimeImpl.decode(HttpServiceRuntimeImpl.java:1289)
    at org.eclipse.equinox.http.servlet.internal.HttpServiceRuntimeImpl.getDispatchTargets(HttpServiceRuntimeImpl.java:544)
    at org.eclipse.equinox.http.servlet.internal.HttpServiceRuntimeImpl.getDispatchTargets(HttpServiceRuntimeImpl.java:274)
    at org.eclipse.equinox.http.servlet.internal.servlet.ProxyServlet.dispatch(ProxyServlet.java:144)
    at org.eclipse.equinox.http.servlet.internal.servlet.ProxyServlet.preprocess(ProxyServlet.java:115)
    at org.eclipse.equinox.http.servlet.internal.servlet.ProxyServlet.service(ProxyServlet.java:104)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.equinox.http.jetty.internal.HttpServerManager$InternalHttpServiceServlet.service(HttpServerManager.java:305)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:763)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:551)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1369)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:489)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1284)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.Server.handle(Server.java:501)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
    at java.lang.Thread.run(Thread.java:748)

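Analysis

Judging from the symptoms, the lookup of the help document seemed to be going wrong, so I started tracing the code.
Following the stack trace above, we first look at the getDispatchTargets method of the HttpServiceRuntimeImpl class. The relevant code is: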

public DispatchTargets getDispatchTargets(
        String requestURI, String extension, String queryString, Match match,
        RequestInfoDTO requestInfoDTO) {

        Collection<ContextController> contextControllers = getContextControllers(
            requestURI);

        if ((contextControllers == null) || contextControllers.isEmpty()) {
            return null;
        }

        String contextPath =
            contextControllers.iterator().next().getContextPath();

        requestURI = requestURI.substring(contextPath.length());

        int pos = requestURI.lastIndexOf('/');

        String servletPath = decode(requestURI);
        String pathInfo = null;

        if (match == Match.CONTEXT_ROOT) {
            pathInfo = Const.SLASH;
            servletPath = Const.BLANK;
        }

        do {
            for (ContextController contextController : contextControllers) {
                DispatchTargets dispatchTargets =
                    contextController.getDispatchTargets(
                        null, requestURI, servletPath, pathInfo,
                        extension, queryString, match, requestInfoDTO);

                if (dispatchTargets != null) {
                    return dispatchTargets;
                }
            }

            if ((match == Match.EXACT) || (match == Match.CONTEXT_ROOT) || (match == Match.DEFAULT_SERVLET)) {
                break;
            }

            if (pos > -1) {
                String newServletPath = requestURI.substring(0, pos);
                pathInfo = decode(requestURI.substring(pos));
                servletPath = decode(newServletPath);
                pos = servletPath.lastIndexOf('/');

                continue;
            }

            break;
        }
        while (true);

        return null;
    }

Note the following code near the end:

if (pos > -1) {
    String newServletPath = requestURI.substring(0, pos);
    pathInfo = decode(requestURI.substring(pos));
    servletPath = decode(newServletPath);
    pos = servletPath.lastIndexOf('/');

    continue;
}

In versions before Eclipse 4.16, this code was:

if (pos > -1) {
    String newServletPath = requestURI.substring(0, pos);
    pathInfo = requestURI.substring(pos);
    servletPath = newServletPath;
    pos = servletPath.lastIndexOf('/');

    continue;
}

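As you can see, the pre-4.16 logic did not call decode; this is a change introduced in 4.16.
But the new code has a bug. For a path containing non-English characters, pos = servletPath.lastIndexOf('/') is an index into the decoded servletPath, yet on the next loop iteration String newServletPath = requestURI.substring(0, pos) uses that index to cut the original, still-encoded URI. In UTF-8 a Chinese character such as 使 is a single char after decoding but nine chars (three %XX escapes) in the encoded URI, so the cut lands in the wrong place.
For example:

Assume: requestURI = "/help/topic/xxx.doc.user/doc/%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C%2F%E5%BF%AB%E9%80%9F%E5%85%A5%E9%97%A8%2F%E5%BC%80%E5%A7%8B.html";
After decoding: servletPath = "/help/topic/xxx.doc.user/doc/使用手册/快速入门/开始.html"
Position of the last "/": pos = servletPath.lastIndexOf('/') = 38
Cutting the original URI with that pos: String newServletPath = requestURI.substring(0, pos) = "/help/topic/xxx.doc.user/doc/%E4%BD%BF"

The cut lands in the middle of the encoded directory name instead of at a path separator. For this redacted sample it happens to fall on an escape boundary, but with other path lengths it falls inside an escape sequence (for instance leaving a trailing "%"), and the subsequent decode call then fails with the exception shown above.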
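
The mismatch can be reproduced outside Eclipse with nothing but java.net.URLDecoder. The following is a minimal sketch, not the Equinox code itself: the class name IndexMismatchDemo and its helper methods are made up for illustration, and the sample URI is the redacted one from the example above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class IndexMismatchDemo {

    // The redacted sample URI from the example above.
    static final String REQUEST_URI = "/help/topic/xxx.doc.user/doc/"
            + "%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C%2F"
            + "%E5%BF%AB%E9%80%9F%E5%85%A5%E9%97%A8%2F"
            + "%E5%BC%80%E5%A7%8B.html";

    // Decodes with UTF-8; the checked exception cannot occur for UTF-8.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Index of the last '/' in the DECODED path.
    static int decodedSlashIndex(String encodedUri) {
        return decode(encodedUri).lastIndexOf('/');
    }

    // What the buggy loop does: applies the decoded index to the ENCODED string.
    static String buggyCut(String encodedUri) {
        return encodedUri.substring(0, decodedSlashIndex(encodedUri));
    }

    // True if decoding the fragment is rejected by URLDecoder.
    static boolean failsToDecode(String fragment) {
        try {
            decode(fragment);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(decodedSlashIndex(REQUEST_URI));
        System.out.println(buggyCut(REQUEST_URI));
        // One character further and the cut ends inside an escape sequence.
        System.out.println(failsToDecode(REQUEST_URI.substring(0, 39)));
    }
}
```

Running main prints the decoded index, the wrongly cut prefix, and whether a cut that ends on a lone "%" is rejected. The last case is the one that produces the "Incomplete trailing escape (%) pattern" message in the stack trace.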

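Would it work to take the 4.15 version of the jar that contains this class and drop it into 4.16? Testing showed that it does not, because later in the call chain the following code is reached:

The code is from org.eclipse.help.internal.webapp.servlet.EclipseConnector: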

public void transfer(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
    // URL
    String pathInfo = req.getPathInfo();
    if (pathInfo == null)
        return;
    if (pathInfo.startsWith("/")) //$NON-NLS-1$
        pathInfo = pathInfo.substring(1);
    String query = req.getQueryString();
    String url = query == null ? pathInfo : (pathInfo + "?" + query); //$NON-NLS-1$
        ............
        String lowerCaseuRL = url.toLowerCase(Locale.ENGLISH);
    if (lowerCaseuRL.startsWith("jar:") //$NON-NLS-1$
            || lowerCaseuRL.startsWith("platform:") //$NON-NLS-1$
            || (lowerCaseuRL.startsWith("file:") && UrlUtil.wasOpenedFromHelpDisplay(url))) { //$NON-NLS-1$
        url = pathInfo; // without query

        // ensure the file is only accessed from a local installation
        if (BaseHelpSystem.getMode() == BaseHelpSystem.MODE_INFOCENTER
            || !UrlUtil.isLocalRequest(req)) {
            return;
        }
    } else {
        // enable activities matching url
        // HelpBasePlugin.getActivitySupport().enableActivities(url);

        url = URIUtil.fromString(url).toString();
        url = "help:" + url; //$NON-NLS-1$
    }

    URLConnection con = createConnection(req, resp, url);

    InputStream is;
    boolean pageNotFound = false;
        ..............

Note the line url = URIUtil.fromString(url).toString(); near the end, and the Javadoc of the URIUtil.fromString() method:

URI org.eclipse.core.runtime.URIUtil.fromString(String uriString) throws URISyntaxException
Returns a URI corresponding to the given unencoded string. This method will take care of encoding any characters that must be encoded according to the URI specification. This method must not be called with a string that already contains an encoded URI, since this will result in the URI escape character ('%') being escaped itself.
Parameters: uriString An unencoded URI string
Returns: A URI corresponding to the given string

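Note the last sentence of that comment: the url argument must be an unencoded URL string; if it is already encoded, the "%" characters in it are encoded a second time. As we saw at the very beginning, the requestURI sent when a help topic is clicked is already encoded, and the pathInfo cut from it is still encoded. With the pre-4.16 servlet jar nothing decodes it on the way, so it arrives here still encoded, and after this line the url is no longer correct.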
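
The double-encoding effect is easy to see in isolation. URIUtil.fromString needs the Eclipse runtime on the classpath, so the sketch below uses the multi-argument java.net.URI constructor from the JDK as a stand-in: like fromString, it expects unencoded input and always quotes a literal "%". The class DoubleEncodingDemo and its method encodePath are hypothetical names used only for this illustration, and the exact output of URIUtil.fromString may differ in detail.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DoubleEncodingDemo {

    // Builds a URI from a path that the constructor treats as UNENCODED text.
    static String encodePath(String path) {
        try {
            // The multi-argument constructors quote illegal characters and
            // always quote '%', so an existing escape gets escaped again.
            return new URI(null, null, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Unencoded input: the Chinese character is encoded once, as intended.
        System.out.println(encodePath("/doc/\u4f7f.html"));
        // Already-encoded input: every '%' is turned into "%25".
        System.out.println(encodePath("/doc/%E4%BD%BF.html"));
    }
}
```

The second call is the situation described above: a path that is already percent-encoded goes through an API that expects raw text, and the resulting "%25E4..." no longer names the original file.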

Compare with the same code before 4.16:

String lowerCaseuRL = url.toLowerCase(Locale.ENGLISH);
if (lowerCaseuRL.startsWith("jar:") //$NON-NLS-1$
        || lowerCaseuRL.startsWith("platform:") //$NON-NLS-1$
        || (lowerCaseuRL.startsWith("file:") && UrlUtil.wasOpenedFromHelpDisplay(url))) { //$NON-NLS-1$
    url = pathInfo; // without query

    // ensure the file is only accessed from a local installation
    if (BaseHelpSystem.getMode() == BaseHelpSystem.MODE_INFOCENTER
            || !UrlUtil.isLocalRequest(req)) {
        return;
    }
} else {
    // enable activities matching url
    // HelpBasePlugin.getActivitySupport().enableActivities(url);

    url = "help:" + url; //$NON-NLS-1$
}

URLConnection con = createConnection(req, resp, url);

As you can see, the pre-4.16 logic did not use URIUtil.fromString() to rebuild the URI.

So now we know why Chinese help documents cannot be opened on 4.16.

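Solutions

Option 1:
The simplest fix: name the help document paths (folders and files) using only English letters and digits.

Option 2:
Replace the following two jars with their pre-4.16 versions: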

org.eclipse.equinox.http.servlet
org.eclipse.help.webapp

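This approach only works for patching a local Eclipse, or an RCP product exported through Eclipse's "Export Product" feature. It is not suitable for RCP products built automatically with Tycho + p2, because replacing jars in a p2 repository is very cumbersome: many places have to be modified, and the jars must be re-signed afterwards.

Option 3:
Our help documents are written in Markdown, and a Maven plugin generates the toc and context files in the format Eclipse requires at build time. To make the documents easy to find and manage, the help directory structure uses Chinese names, so renaming every folder and file to English is not realistic.
The help: URL built above is served by the HelpURLConnection class. After its getFile() method has constructed the help file path, the following method is called to locate the help file: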

private InputStream getLocalHelp(Bundle plugin) {
    // first try using content provider, then try to find the file
    // inside doc.zip, and finally try the file system
    InputStream in = ResourceLocator.openFromProducer(plugin,
            query == null ? getFile() : getFile() + "?" + query, //$NON-NLS-1$
            getLocale());

    if (in == null) {
        in = ResourceLocator.openFromPlugin(plugin, getFile(), getLocale());
    }
    if (in == null) {
        in = ResourceLocator.openFromZip(plugin, "doc.zip", //$NON-NLS-1$
                getFile(), getLocale());
    }
    return in;
}

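As you can see, the lookup first tries the producer. The producer is an extension point, which gives us a new approach.
The third option, then, is to automatically replace the Chinese characters in paths with Unicode codes when the toc and context files are generated, and to use the org.eclipse.help.contentProducer extension point to implement a custom lookup for the help files.
Here are the main code snippets.
Converting Chinese characters to Unicode codes: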

/**
 * Converts the given character to its unicode code form, for example the character "简" becomes "u7b80".
 * The code is always zero-padded to four hex digits, because the decoder reads exactly four digits after "u".
 * @param source the character to convert
 * @return "u" followed by the four-digit lowercase hex code of the character
 */
private String toUnicode(char source) {
    return String.format("u%04x", (int) source);
}

Define the contentProducer extension. The code of the producer class:

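import java.io.InputStream;
import java.util.Locale;

import org.eclipse.help.IHelpContentProducer;
import org.eclipse.help.internal.util.ResourceLocator;

public class HelpContentProducer implements IHelpContentProducer {

    @SuppressWarnings("restriction")
    @Override
    public InputStream getInputStream(String pluginID, String href, Locale locale) {
        // Strip the query string, if any, so that only the file path is decoded.
        String filePathNoQuery = href;
        int queryIndex = href.indexOf('?');
        if (queryIndex >= 0) {
            filePathNoQuery = href.substring(0, queryIndex);
        }
        String decodedFilePath = decode(filePathNoQuery);

        return ResourceLocator.openFromPlugin(pluginID, decodedFilePath, locale == null ? null : locale.toString());
    }

    /**
     * Decodes the given help file path, replacing each unicode code ("u" plus four hex digits)
     * with the character it represents. A path without unicode codes is returned unchanged.
     *
     * @param filePath the encoded help file path
     * @return the decoded path
     */
    private String decode(String filePath) {
        StringBuilder decodedStringBuilder = new StringBuilder();
        for (int i = 0; i < filePath.length(); i++) {
            char ch = filePath.charAt(i);
            if (ch == 'u' && isHexQuad(filePath, i + 1)) {
                decodedStringBuilder.append((char) Integer.parseInt(filePath.substring(i + 1, i + 5), 16));
                i = i + 4;
            } else {
                // not a unicode code, keep the character as is
                decodedStringBuilder.append(ch);
            }
        }
        return decodedStringBuilder.toString();
    }

    /**
     * Returns true if four hex digits start at the given index. The bounds check avoids a
     * StringIndexOutOfBoundsException when a 'u' appears within the last four characters.
     */
    private static boolean isHexQuad(String s, int start) {
        if (start + 4 > s.length()) {
            return false;
        }
        for (int k = start; k < start + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }
}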
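
To see the two halves work together, here is a self-contained round-trip sketch with no Eclipse dependencies. It assumes the build step encodes every non-ASCII character of a path with the same uXXXX scheme as toUnicode above. The class UnicodePathCodec and the methods encodePath, decodePath, and isHexQuad are illustrative names, not part of the original plugin.

```java
final class UnicodePathCodec {

    // Replaces every non-ASCII char with 'u' plus four lowercase hex digits.
    static String encodePath(String path) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char ch = path.charAt(i);
            if (ch > 0x7f) {
                sb.append(String.format("u%04x", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    // Reverses encodePath: 'u' followed by four hex digits becomes one char.
    static String decodePath(String path) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char ch = path.charAt(i);
            if (ch == 'u' && isHexQuad(path, i + 1)) {
                sb.append((char) Integer.parseInt(path.substring(i + 1, i + 5), 16));
                i += 4;
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    // True if four hex digits start at the given index.
    static boolean isHexQuad(String s, int start) {
        if (start + 4 > s.length()) {
            return false;
        }
        for (int k = start; k < start + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String original = "doc/\u4f7f\u7528\u624b\u518c/\u5f00\u59cb.html";
        String encoded = encodePath(original);
        System.out.println(encoded);
        System.out.println(decodePath(encoded).equals(original));
        // Limitation: an ASCII name with 'u' + four hex digits is misread.
        System.out.println(decodePath("subcafe.html").equals("subcafe.html"));
    }
}
```

One thing to keep in mind with this scheme: the bare "u" prefix is ambiguous. An ordinary ASCII file name in which "u" happens to be followed by four hex digits (such as "subcafe.html") is decoded as if it contained an encoded character, so such names have to be avoided, or a less common prefix chosen on both the encoding and decoding side.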

Finally, do not forget to replace the Chinese links inside the Markdown as well when reading it to generate the HTML.
