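VS Code Extension Development: EDM Syntax Checking (1)

First, a recap of Yomi's earlier talk, "Getting Started with VS Code Extension Development".
Link: Yomi's blog

Main points:

Environment setup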

npm install -g yo generator-code

Project scaffolding

yo code

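Fill in the project details at the prompts, and the project is scaffolded.

Running and debugging the project

Open the folder in VS Code; at this point it is already a simple, working extension.

Press F5 in VS Code to launch the project. VS Code opens a new window that hosts the extension for debugging.

In the new window, press Command + Shift + P to open the Command Palette, type the Hello World command, and press Enter. A notification pops up:

[Screenshot: the Hello World notification]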

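Project layout

The project structure is simple. The two files that matter most are the manifest, package.json, and the extension entry file, src/extension.ts: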

.
├── .vscode
│ ├── launch.json // Config for launching and debugging the extension
│ └── tasks.json // Config for build task that compiles TypeScript
├── .gitignore // Ignore build output and node_modules
├── README.md // Readable description of your extension's functionality
├── src
│ └── extension.ts // Extension source code
├── package.json // Extension manifest
├── tsconfig.json // TypeScript configuration

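Three fields in package.json matter here:

main defines the entry point of the whole extension.
contributes.commands registers a command named mypro.helloWorld, which is implemented in src/extension.ts (it pops up a Hello World notification). The syntax check follows a similar pattern.
That alone is not enough: the command is defined, but VS Code does not yet know when to run it. Adding onCommand:mypro.helloWorld to activationEvents tells VS Code to activate the extension and run the code above when the user invokes the command.
Besides onCommand there are onView, onUri, onLanguage, and others. This article uses onLanguage, which activates the extension when a file of the given language is opened.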

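VS Code diagnostics

Beyond the basics above, we also need VS Code diagnostics.
VS Code supports error checking. An EDM syntax extension reports the results of scanning the code as diagnostics, and each diagnostic is carried by a vscode.Diagnostic object.

[Figure: members of the vscode.Diagnostic class and its relationships to related classes]

From smallest to largest, these classes are:

Position: the coordinates of a single character on a line
Range: defined by two Positions, a start and an end
Location: a Range paired with a URI
DiagnosticRelatedInformation: a Location paired with a message
Diagnostic: at its core a message string and a Range, plus an optional list of DiagnosticRelatedInformation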
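
The nesting of these classes can be sketched with plain TypeScript interfaces. This is a simplified stand-in and not the real vscode classes: the `...Like` interface names, the string-typed `uri`, and the `makeDiagnostic` helper are all assumptions made for illustration.

```typescript
// Simplified local stand-ins for the vscode classes listed above.
// They only model the nesting; the real classes carry more members.
interface PositionLike { line: number; character: number } // zero-based
interface RangeLike { start: PositionLike; end: PositionLike }
interface LocationLike { uri: string; range: RangeLike } // the real API uses a Uri object
interface RelatedInfoLike { location: LocationLike; message: string }
interface DiagnosticLike {
  message: string;
  range: RangeLike;
  relatedInformation?: RelatedInfoLike[];
}

// Hypothetical helper: builds a diagnostic that covers part of one line.
function makeDiagnostic(
  uri: string,
  line: number,
  from: number,
  to: number,
  message: string
): DiagnosticLike {
  const range: RangeLike = {
    start: { line, character: from },
    end: { line, character: to },
  };
  return {
    message,
    range,
    // The related information points back at the same range in the same document.
    relatedInformation: [{ location: { uri, range }, message: "edm grammar check" }],
  };
}

const example = makeDiagnostic("file:///example.html", 0, 0, 5, "example message");
console.log(JSON.stringify(example, null, 2));
```

Reading the printed object from the inside out follows the same order as the list: positions, then a range, then a location, then related information, then the diagnostic itself.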

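A URI (Uniform Resource Identifier) identifies a resource. A URL (Uniform Resource Locator) is a kind of URI: it identifies an Internet resource and also specifies how to access or operate on it.
The main drawback of a URL is that it must change whenever the resource moves to a different location. Location-independent schemes such as the URN (Uniform Resource Name) and the URC (Uniform Resource Citation) were proposed to address this.
In a Location, the URI identifies the document that the Range belongs to.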

Constructing a diagnostic

Take the following HTML as an example and save it as test.html:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>DOCUMENT</title>
    <style></style> <!-- EDM does not support the style tag -->
</head>
<body>
    <tr> 
        <td >

        </td>
    </tr> 
</body>
</html>

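This example uses a style tag, which EDM does not support.
The problem sits on line 8, from character 5 through character 19. Positions are zero-based, so we build a Range whose start and end Positions are (7,4) and (7,18). Note that the end of a Range is exclusive, so an end of (7,19) would be needed to cover the final ">" as well.
With the Range, a message string, and a severity level, we can construct a Diagnostic.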

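import * as vscode from 'vscode';

// Zero-based (line, character) pairs: line 8 of the file is line index 7.
const diagnostic1: vscode.Diagnostic = new vscode.Diagnostic(
    new vscode.Range(
        new vscode.Position(7, 4),
        new vscode.Position(7, 18)
    ),
    'EDM does not support the style tag',
    vscode.DiagnosticSeverity.Warning
);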
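
The index arithmetic behind these positions can be sketched on its own. Editors show one-based line and column numbers, while Position takes zero-based values and the end of a Range is exclusive. The `toZeroBasedSpan` helper and the `ZeroBasedSpan` type below are hypothetical and use plain objects rather than the vscode classes.

```typescript
interface ZeroBasedSpan {
  start: { line: number; character: number };
  end: { line: number; character: number };
}

// Hypothetical helper: converts a one-based line and a one-based, inclusive
// column span into zero-based positions with an exclusive end.
function toZeroBasedSpan(line1: number, firstCol1: number, lastCol1: number): ZeroBasedSpan {
  return {
    start: { line: line1 - 1, character: firstCol1 - 1 },
    // The last column is inclusive and one-based, so its zero-based index is
    // lastCol1 - 1, and the exclusive end is one past that: lastCol1.
    end: { line: line1 - 1, character: lastCol1 },
  };
}

// Line 8 of test.html, with the trailing comment shortened.
const lineText = "    <style></style> <!-- comment -->";
const span = toZeroBasedSpan(8, 5, 19);
console.log(span); // start is (7, 4), end is (7, 19)
console.log(lineText.slice(span.start.character, span.end.character)); // prints "<style></style>"
```

With an end character of 19 the span includes the closing ">" of the tag, whereas an end of 18 stops one character short.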

The complete updateDiags code:

import * as path from 'path';
import * as vscode from 'vscode';

export function updateDiags(
    document: vscode.TextDocument,
    collection: vscode.DiagnosticCollection
): void {
    // A single hard-coded diagnostic for the style tag on line 8.
    const diagnostic: vscode.Diagnostic = new vscode.Diagnostic(
        new vscode.Range(new vscode.Position(7, 4), new vscode.Position(7, 18)),
        'EDM does not support the style tag',
        vscode.DiagnosticSeverity.Warning
    );
    diagnostic.source = 'edm Helper';
    diagnostic.relatedInformation = [
        new vscode.DiagnosticRelatedInformation(
            new vscode.Location(
                document.uri,
                new vscode.Range(new vscode.Position(7, 0), new vscode.Position(7, 18))
            ),
            'edm grammar check'
        ),
    ];
    diagnostic.code = 102;

    // Only report on test.html; clear the collection for any other file.
    if (document && path.basename(document.uri.fsPath) === 'test.html') {
        collection.set(document.uri, [diagnostic]);
    } else {
        collection.clear();
    }
}

Then call this function from the activate function:

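// Same file as updateDiags, so vscode is already imported.
export function activate(context: vscode.ExtensionContext) {
    const diag_coll = vscode.languages.createDiagnosticCollection('basic-lint-1');
    // Dispose of the collection when the extension is deactivated.
    context.subscriptions.push(diag_coll);

    // Check the document that is already open when the extension activates.
    if (vscode.window.activeTextEditor) {
        updateDiags(vscode.window.activeTextEditor.document, diag_coll);
    }

    // Re-check when the user switches to another editor.
    context.subscriptions.push(
        vscode.window.onDidChangeActiveTextEditor((e: vscode.TextEditor | undefined) => {
            if (e !== undefined) {
                updateDiags(e.document, diag_coll);
            }
        })
    );
    // Re-check whenever the document text changes.
    context.subscriptions.push(
        vscode.workspace.onDidChangeTextDocument((e: vscode.TextDocumentChangeEvent) => {
            updateDiags(e.document, diag_coll);
        })
    );
}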

Press F5 to run it, and the check result shows up.


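htmlparser and regular expressions

That completes a simple diagnostic. In real development, though, the Range and message of each diagnostic should be derived from the content of the HTML document.

The HTML content we get in the activate function is a plain string, which is awkward to diagnose directly. We therefore use htmlparser to parse it into a syntax tree.

Below is a parsed syntax tree. It evidently comes from a fuller test file than the test.html shown above, since it contains extra nodes such as link and class attributes: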

[
  {
    "raw": "!DOCTYPE html",
    "data": "!DOCTYPE html",
    "type": "directive",
    "name": "!DOCTYPE"
  },
  {
    "raw": "html lang=\"en\"",
    "data": "html lang=\"en\"",
    "type": "tag",
    "name": "html",
    "attribs": {
      "lang": "en"
    },
    "children": [
      {
        "raw": "head",
        "data": "head",
        "type": "tag",
        "name": "head",
        "children": [
          {
            "raw": "meta charset=\"UTF-8\"",
            "data": "meta charset=\"UTF-8\"",
            "type": "tag",
            "name": "meta",
            "attribs": {
              "charset": "UTF-8"
            }
          },
          {
            "raw": "meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\"",
            "data": "meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\"",
            "type": "tag",
            "name": "meta",
            "attribs": {
              "http-equiv": "X-UA-Compatible",
              "content": "IE=edge"
            }
          },
          {
            "raw": "meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\"",
            "data": "meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\"",
            "type": "tag",
            "name": "meta",
            "attribs": {
              "name": "viewport",
              "content": "width=device-width, initial-scale=1.0"
            }
          },
          {
            "raw": "title",
            "data": "title",
            "type": "tag",
            "name": "title",
            "children": [
              {
                "raw": "DOCUMENT",
                "data": "DOCUMENT",
                "type": "text"
              }
            ]
          },
          {
            "raw": "style",
            "data": "style",
            "type": "style",
            "name": "style"
          },
          {
            "raw": "link href=\"\"   ",
            "data": "link href=\"\"",
            "type": "tag",
            "name": "link",
            "attribs": {
              "href": ""
            }
          }
        ]
      },
      {
        "raw": "body",
        "data": "body",
        "type": "tag",
        "name": "body",
        "children": [
          {
            "raw": "tr  class    = \"a1111\" style= \"  position  :   relative;   position :     absolute; background: #333;  margin  : 0 auto;  background: rgb(red, green, blue);background: url(ddd);\"",
            "data": "tr  class    = \"a1111\" style= \"  position  :   relative;   position :     absolute; background: #333;  margin  : 0 auto;  background: rgb(red, green, blue);background: url(ddd);\"",
            "type": "tag",
            "name": "tr",
            "attribs": {
              "class": "a1111",
              "style": "  position  :   relative;   position :     absolute; background: #333;  margin  : 0 auto;  background: rgb(red, green, blue);background: url(ddd);"
            },
            "children": [
              {
                "raw": "td style=\"color:#EF6C00;font-size: 14px!important;\"",
                "data": "td style=\"color:#EF6C00;font-size: 14px!important;\"",
                "type": "tag",
                "name": "td",
                "attribs": {
                  "style": "color:#EF6C00;font-size: 14px!important;"
                },
                "children": [
                  {
                    "raw": " \n            ffffddddddd\n        ",
                    "data": " \n            ffffddddddd\n        ",
                    "type": "text"
                  }
                ]
              }
            ]
          },
          {
            "raw": "tr",
            "data": "tr",
            "type": "tag",
            "name": "tr",
            "children": [
              {
                "raw": "td style=\"border-radius: 0！important;\"",
                "data": "td style=\"border-radius: 0！important;\"",
                "type": "tag",
                "name": "td",
                "attribs": {
                  "style": "border-radius: 0！important;"
                },
                "children": [
                  {
                    "raw": "div",
                    "data": "div",
                    "type": "tag",
                    "name": "div",
                    "children": [
                      {
                        "raw": "1111",
                        "data": "1111",
                        "type": "text"
                      }
                    ]
                  },
                  {
                    "raw": "p",
                    "data": "p",
                    "type": "tag",
                    "name": "p",
                    "children": [
                      {
                        "raw": "222",
                        "data": "222",
                        "type": "text"
                      }
                    ]
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

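With the parsed syntax tree it is easy to walk the content and build regular expressions dynamically, which in turn give us the start and end Positions of each diagnostic.

Take the `style` tag and the `class` attribute as examples:

In EDM, the style tag and the class attribute should not appear. Whenever the tree contains a style tag or a class attribute, we generate a diagnostic; regular expressions then locate it in the source text.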
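
The traversal step can be sketched independently of the editor. The node shape below copies the fields visible in the parsed tree above (type, name, attribs, children); the `DomNode` and `Finding` types and the `collectFindings` function are hypothetical names, and the sample tree is written by hand rather than produced by htmlparser.

```typescript
interface DomNode {
  type: string;
  name?: string;
  attribs?: Record<string, string>;
  children?: DomNode[];
}

interface Finding {
  rule: "style" | "class";
  value?: string; // the class value, when the rule is "class"
}

// Walks the tree depth-first and records every style node and class attribute.
function collectFindings(nodes: DomNode[], found: Finding[] = []): Finding[] {
  for (const node of nodes) {
    if (node.type === "style") {
      found.push({ rule: "style" });
    }
    if (node.type === "tag") {
      if (node.attribs && "class" in node.attribs) {
        found.push({ rule: "class", value: node.attribs.class });
      }
      if (node.children) {
        collectFindings(node.children, found);
      }
    }
  }
  return found;
}

// Hand-written sample in the same shape as the parsed tree.
const sampleTree: DomNode[] = [
  {
    type: "tag",
    name: "html",
    children: [
      { type: "tag", name: "head", children: [{ type: "style", name: "style" }] },
      {
        type: "tag",
        name: "body",
        children: [{ type: "tag", name: "tr", attribs: { class: "a1111" } }],
      },
    ],
  },
];

console.log(collectFindings(sampleTree));
```

The sketch only decides what to report. Where each finding sits in the source text is a separate question, which is what the regular expressions answer.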

Rules and messages are defined in ruleCollection, and getDiagnostics matches them against the content passed in.

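// Assumed imports: the original omits them. These follow the usual
// language server package layout.
import { Diagnostic, DiagnosticSeverity } from "vscode-languageserver/node";
import { TextDocument } from "vscode-languageserver-textdocument";

// Walks the htmlparser tree and reports every style tag and class attribute.
function testGrammar(dom: any[], textDocument: TextDocument) {
  dom.forEach((item: any) => {
    if (item.type === "style") getDiagnostics("style", textDocument);
    if (item.type === "tag") {
      const attribs = item.attribs || null;
      if (attribs && "class" in attribs) {
        getDiagnostics("class", textDocument, attribs.class);
      }
      // Recurse into child nodes, skipping the children of title.
      if (item.children && item.name !== "title") {
        testGrammar(item.children, textDocument);
      }
    }
  });
}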

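// Returns the rule table. When `content` is given (for example a class
// value), the regex is narrowed to that specific value.
// Note: `content` is interpolated as-is, so regex metacharacters in it are not escaped.
let ruleCollection = (content?: string): any => {
  let ruleObject = {
    style: {
      // Matches <style ...> and </style>; [^>]* stops at the first ">".
      reg: "</?style\\b[^>]*>",
      message:
        "<style> is not supported by edm, use Inline CSS Style. \n edm不支持<style>标签，请使用内联样式",
      severity: 1,
      type: "style",
    },
    class: {
      // Allows optional whitespace on either side of "=".
      reg: content ? `class\\s*=\\s*"${content}"` : `class\\s*=\\s*`,
      message:
        "class is not supported by edm, use Inline CSS Style. \n edm不支持class属性，请使用内联样式",
      severity: 1,
      type: "class",
    },
    position: {
      // Without `content`, the colon part is optional, so a bare "position" also matches.
      reg: content
        ? `position\\s*:\\s*${content}`
        : `position(\\s*:\\s*)?`,
      message: "position is not supported by edm \n edm不支持position定位",
      severity: 1,
      type: "styleAttr",
    }
  };

  return ruleObject;
};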

// `diagnostics` (a Diagnostic[]) and `connection` (the language server
// connection) are assumed to be defined at module scope; the original does not show them.
function getDiagnostics(
  content: string,
  textDocument: TextDocument,
  params?: string,
  message?: string
) {
  let m: RegExpExecArray | null;
  const text = textDocument.getText();
  // Build the rule's regex, narrowed by `params` when it is provided.
  const reg = new RegExp(ruleCollection(params)[content].reg, "g");
  while ((m = reg.exec(text))) {
    const diagnostic: Diagnostic = {
      severity: DiagnosticSeverity.Warning,
      range: {
        // Convert the string offsets of the match into line/character positions.
        start: textDocument.positionAt(m.index),
        end: textDocument.positionAt(m.index + m[0].length),
      },
      message: message ? message : ruleCollection()[content].message,
      source: "edmHelper",
    };
    diagnostic.relatedInformation = [
      {
        location: {
          uri: textDocument.uri,
          range: Object.assign({}, diagnostic.range),
        },
        message: "edm grammar",
      },
    ];
    diagnostics.push(diagnostic);
  }
  // Publish everything collected so far for this document.
  connection.sendDiagnostics({ uri: textDocument.uri, diagnostics });
}

Here is the result:

[Screenshot: the diagnostics underlined in the editor]

This way, every problem spot is underlined in full, based on the actual content of the HTML file.
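
How a regex match turns into a start and end position can also be sketched without the editor. The `positionAt` function below is a hand-written stand-in for the document method of the same name and assumes "\n" line endings; `findRanges` is a hypothetical helper, and the pattern is a simplified style-tag regex chosen for the example.

```typescript
interface Pos { line: number; character: number }

// Stand-in for TextDocument.positionAt: maps a string offset to a zero-based
// line/character pair by counting the "\n" characters before the offset.
function positionAt(text: string, offset: number): Pos {
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < offset; i++) {
    if (text[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: offset - lineStart };
}

// Hypothetical helper: returns the range of every match of `pattern` in `text`.
function findRanges(text: string, pattern: string): { start: Pos; end: Pos }[] {
  const reg = new RegExp(pattern, "g");
  const ranges: { start: Pos; end: Pos }[] = [];
  let m: RegExpExecArray | null;
  while ((m = reg.exec(text)) !== null) {
    if (m[0].length === 0) {
      reg.lastIndex++; // guard against looping forever on empty matches
      continue;
    }
    ranges.push({
      start: positionAt(text, m.index),
      end: positionAt(text, m.index + m[0].length),
    });
  }
  return ranges;
}

const html = "<head>\n    <style></style>\n</head>";
const styleRanges = findRanges(html, "</?style[^>]*>");
console.log(JSON.stringify(styleRanges));
// Two ranges on line 1: characters 4 to 11 (<style>) and 11 to 19 (</style>).
```

Each end position is computed from the match length, so it lands one past the last matched character, which is the exclusive end a range expects.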

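The Language Server Protocol (LSP)

The syntax diagnostics we wrote are a programming language feature of a VS Code extension, and such features rely on the Language Server Protocol (LSP).

A language server is a cross-editor specification for implementing language support. It was proposed by Microsoft, and VS Code, Vim, and Atom already support it.
LSP standardizes how language analysis is exposed to different IDEs. Most of an IDE's work is providing parsing, go-to-definition, highlighting, and similar features for each language, which is why LSP matters.

[Figures: two diagrams illustrating what LSP does and why it is needed]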

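LSP mainly solves two problems:
1. Reuse of language support. For example, C++ support in Eclipse is written in Java, simply because Eclipse itself is written in Java. Offering the same C++ support in VS Code would mean rewriting it in JavaScript, which reinvents the wheel.
2. Process isolation. Language analysis is heavy and can sometimes take a very long time; if the whole of VS Code froze in the meantime, it would be unusable. So this work is split out into a separate server process.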
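
Because the server runs as a separate process, diagnostics reach the editor as JSON-RPC messages. The sketch below builds a `textDocument/publishDiagnostics` notification and frames it with the `Content-Length` header of the LSP base protocol. The `frame` helper, the `Lsp...` type names, and the sample URI are assumptions made for illustration; a real server would normally leave this to a library such as vscode-languageserver.

```typescript
interface LspPosition { line: number; character: number }
interface LspDiagnostic {
  range: { start: LspPosition; end: LspPosition };
  severity: number; // 1 = Error, 2 = Warning, 3 = Information, 4 = Hint
  source: string;
  message: string;
}

// Hypothetical helper: serializes a JSON-RPC message and prepends the
// Content-Length header, which counts the bytes of the UTF-8 encoded body.
function frame(message: object): string {
  const body = JSON.stringify(message);
  const byteLength = new TextEncoder().encode(body).length;
  return `Content-Length: ${byteLength}\r\n\r\n${body}`;
}

const styleWarning: LspDiagnostic = {
  range: { start: { line: 7, character: 4 }, end: { line: 7, character: 19 } },
  severity: 2,
  source: "edmHelper",
  message: "<style> is not supported by edm, use Inline CSS Style.",
};

// A notification carries no id, because the server expects no reply.
const notification = {
  jsonrpc: "2.0",
  method: "textDocument/publishDiagnostics",
  params: { uri: "file:///test.html", diagnostics: [styleWarning] },
};

const framed = frame(notification);
console.log(framed);
```

Since the payload is plain JSON over a stream, the same server can serve any editor that speaks the protocol, which is the reuse argument in point 1.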

How to use it is covered in the next installment.
