JavaScript Regular Expressions

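A regular expression is built from ordinary characters and special characters (metacharacters). Metacharacters are symbols that carry a special meaning inside a regular expression.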

Creating a regular expression

Literal syntax


let hello = "Hello, World";
console.log(/o/.test(hello)); // true

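A literal is written between two slashes (/.../). A literal cannot contain a variable. eval can work around this, but it executes arbitrary code, so the constructor syntax shown later is usually the safer choice.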


let hello = "Hello, World";
let letter = "o";
console.log(eval(`/${letter}/`).test(hello)); // true

Constructor syntax


let testStr = "Hello, World";
let letter = "o";
let reg = new RegExp(letter);
console.log(reg.test(testStr)); // true
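
When the pattern text comes from a variable, any metacharacters in it keep their special meaning. The sketch below uses a hypothetical helper, escapeRegExp (an illustrative name, not a built-in function), to escape them before calling the constructor.

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "1+1";

// Unescaped: "+" is a quantifier, so the pattern means "one or more 1s, then 1".
const unescapedRe = new RegExp(userInput);
console.log(unescapedRe.test("1+1=2")); // false

// Escaped: "+" is matched literally.
const escapedRe = new RegExp(escapeRegExp(userInput));
console.log(escapedRe.test("1+1=2")); // true
```

Escaping is only needed when the variable should be matched as plain text; if the variable is meant to hold a pattern, pass it to the constructor unchanged.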

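Predefined characters

Metacharacters

Metacharacter	Description
\d	Matches any digit from 0 to 9; equivalent to [0-9]
\D	Matches any character that is not a digit; equivalent to [^0-9]
\w	Matches any letter, digit, or underscore; equivalent to [A-Za-z0-9_]
\W	Matches any character other than a letter, digit, or underscore; equivalent to [^A-Za-z0-9_]
\s	Matches a whitespace character (space, tab, line break, and so on); roughly equivalent to [ \t\r\n\v\f]
\S	Matches any non-whitespace character; roughly equivalent to [^ \t\r\n\v\f]
.	Matches any character except a line terminator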

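Escape character

\ is the escape character; it changes the meaning of the character that follows. Some characters have a special meaning in a regular expression, and escaping them restores their literal meaning. For example, . matches any character except a line break, while \. matches a literal dot.

Escaping differs slightly between a regex literal and a pattern passed to the RegExp constructor, as the following example shows.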


const price = 11.11; // test() converts the number to the string "11.11"
console.log(/\d+\.\d+/.test(price)); // true
const reg = new RegExp("\\d+\\.\\d+");
console.log(reg.test(price)); // true

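Inside a string literal, "\d" is the same as "d" because the string itself consumes the backslash. A pattern passed to the constructor as a string therefore needs two backslashes (\\) for every regex escape.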
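
As one possible way to avoid doubling every backslash, the built-in String.raw tag keeps backslashes as written. The sketch below compares it with an ordinary string literal; the variable names are illustrative only.

```javascript
// String.raw leaves backslashes untouched, so the pattern reads like a literal.
const rawPattern = String.raw`\d+\.\d+`;
const rawRe = new RegExp(rawPattern);
console.log(rawRe.source);        // \d+\.\d+
console.log(rawRe.test("11.11")); // true

// An ordinary string literal drops the single backslashes.
const lostEscapes = new RegExp("\d+\.\d+");
console.log(lostEscapes.source);        // d+.d+
console.log(lostEscapes.test("11.11")); // false
```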

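Alternation

| matches if any one of the alternatives matches. It has the lowest precedence of all regex operators, so anchors and other atoms bind to the alternative they sit in.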


var rg = /^abc|edg$/
console.log(rg.test('edgf')) // false
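
Because | has such low precedence, /^abc|edg$/ reads as "starts with abc" or "ends with edg". A small sketch of the difference that grouping makes (the variable names are illustrative):

```javascript
// Without a group, ^ belongs to the first alternative and $ to the second.
const looseRe = /^abc|edg$/;
console.log(looseRe.test("abcXYZ")); // true: starts with abc
console.log(looseRe.test("XYZedg")); // true: ends with edg

// With a group, both anchors apply to every alternative.
const strictRe = /^(abc|edg)$/;
console.log(strictRe.test("abcXYZ")); // false
console.log(strictRe.test("edg"));    // true
```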

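Anchors

Anchors assert the position of a match instead of matching a character. The two main anchors are:

Anchor	Description
^	Matches at the start of the input (with the m flag, also at the start of each line)
$	Matches at the end of the input (with the m flag, also at the end of each line)

Matching the start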


var regexp = new RegExp(/^abc/)
console.log(regexp.test('abcd')) // true
console.log(regexp.test('dabc')) // false

Matching the end


var regexp = new RegExp(/abc$/)
console.log(regexp.test('abcd')) // false
console.log(regexp.test('dabc')) // true

Exact match


var regexp = new RegExp(/^abc$/)
console.log(regexp.test('abc')) // true
console.log(regexp.test('dabc')) // false
console.log(regexp.test('abcabc')) // false

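Spaces in a regular expression

A space in a regular expression is treated as an ordinary character and matches a literal space.

Any character

Use [\s\S] or [\d\D] to match any character, including line breaks.

Example: phone numbers

Landline numbers in China use one of two formats: 010-12345678 or 0530-1234567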


// Group the alternatives so that ^ and $ apply to both formats
var reg = /^(\d{3}-\d{8}|\d{4}-\d{7})$/
console.log(reg.test('010-23333333')) // true

Example: form validation


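var regmsg = /^\d{6}$/ // exactly six digits
var msg = document.querySelector('.msg')
regExp(msg, regmsg)

// Validate the element's value against reg when it loses focus
function regExp(ele, reg) {
  ele.onblur = function () {
    if (reg.test(this.value)) {
      // the value is valid
    } else {
      // the value is invalid
    }
  }
}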

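Flags

/pattern/flags: the flags (also called modifiers) after the closing slash control how matching is performed.

Flag	Description
(none)	Finds only the first match
g	Global: finds all matches
i	Ignores case
gi	Global and case-insensitive
m	Multiline: ^ and $ also match at the start and end of each line, not only of the whole string
s	dotAll (single-line) mode: . also matches line terminators, so it matches any character
y	Sticky: matches only at lastIndex; if there is no match at that position, the match fails
u	Unicode mode: the pattern is matched by Unicode code point, which matters for characters that need two UTF-16 code units, such as emoji and rare CJK characters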
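
The i, m, and u flags are not demonstrated elsewhere in this article, so here is a minimal sketch of their effect. The sample strings and variable names are made up for illustration.

```javascript
// i: case-insensitive matching
console.log(/hello/.test("HELLO"));  // false
console.log(/hello/i.test("HELLO")); // true

// m: ^ and $ also match at line boundaries
const twoLines = "one\ntwo";
console.log(/^two$/.test(twoLines));  // false
console.log(/^two$/m.test(twoLines)); // true

// u: "." matches a whole code point instead of a single UTF-16 code unit
const emoji = "😀"; // stored as two UTF-16 code units
console.log(emoji.length);      // 2
console.log(/^.$/.test(emoji));  // false
console.log(/^.$/u.test(emoji)); // true
```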

dotAll (single-line) mode

Adding the s flag enables dotAll mode, in which . matches every character, including line breaks.

Without the s flag


const hello = `
Hello,
World!
`;
console.log(hello.match(/He.+/)[0]);
// Hello,

With the s flag


const hello = `
Hello,
World!
`;
console.log(hello.match(/He.+/s)[0]);
// Hello,
// World!

lastIndex

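The lastIndex property of a RegExp object gets or sets the index at which the next match starts. It only takes effect when the g or y flag is set.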


let hello = "Hello, World";
let reg = /He.+/g;
console.log(`initial lastIndex ${reg.lastIndex}`); // initial lastIndex 0
console.log(reg.exec(hello)); // [ 'Hello, World', index: 0, input: 'Hello, World', groups: undefined ]
console.log(`lastIndex after a match ${reg.lastIndex}`); // lastIndex after a match 12
reg.lastIndex = 1;
console.log(`lastIndex after assignment ${reg.lastIndex}`); // lastIndex after assignment 1
console.log(reg.test(hello)); // false
console.log(`lastIndex after a failed match ${reg.lastIndex}`); // lastIndex after a failed match 0
console.log(reg.exec(hello)); // [ 'Hello, World', index: 0, input: 'Hello, World', groups: undefined ]
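
Because lastIndex persists between calls, reusing one g-flagged regex with test on different strings can give surprising results. A minimal sketch of the pitfall, with illustrative names:

```javascript
const digitRe = /\d/g;

const firstCall = digitRe.test("a1");  // true, lastIndex is now 2
const secondCall = digitRe.test("1a"); // false: the search started at index 2
const thirdCall = digitRe.test("1a");  // true: the failed call reset lastIndex to 0
console.log(firstCall, secondCall, thirdCall); // true false true

// Without the g flag, lastIndex is ignored and every call starts at index 0.
const plainDigitRe = /\d/;
console.log(plainDigitRe.test("a1"), plainDigitRe.test("1a")); // true true
```

For a simple yes/no check, leaving out the g flag (or resetting lastIndex to 0 before each call) avoids the problem.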

Difference between the y and g flags


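let hello = "hello, world";
// g searches forward from lastIndex until it finds a match
console.log(/o/g.exec(hello)); // [ 'o', index: 4, input: 'hello, world', groups: undefined ]
// y must match exactly at lastIndex (0 here), where the character is "h"
console.log(/o/y.exec(hello)); // null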
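
One common use of the y flag is a tokenizer that must consume the input without skipping anything. The following is only a sketch under simple assumptions: tokenize is a hypothetical function, and the token set (integers and a few operators) is chosen purely for illustration.

```javascript
// Hypothetical tokenizer: every token must start exactly where the last one ended.
function tokenize(input) {
  const tokenRe = /\s*(\d+|[-+*\/()])\s*/y;
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    tokenRe.lastIndex = pos;          // the sticky match is attempted only here
    const found = tokenRe.exec(input);
    if (found === null) {
      throw new SyntaxError(`Unexpected character at index ${pos}`);
    }
    tokens.push(found[1]);
    pos = tokenRe.lastIndex;          // continue right after the match
  }
  return tokens;
}

console.log(tokenize("12 + 3*(4)")); // [ '12', '+', '3', '*', '(', '4', ')' ]
```

With the g flag instead of y, an unexpected character such as "$" would be skipped silently rather than reported.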

[] Character class

Matches any single character from the listed set


var rg = /[abcd]/
console.log(rg.test('ddos')) // true

A match requires only one of the characters inside []


var rg = /^[abcd]$/ // exactly one of a, b, c, or d
console.log(rg.test('df')) // false

[-] Range inside a character class

Matches any character within the given range


var rg = /^[a-z]$/
console.log(rg.test('c')) // true

Combining ranges


var rg = /^[a-zA-Z0-9]$/
console.log(rg.test(9)) // true (the number 9 is converted to "9")

[^] Negated character class

Matches any character that is not listed inside the brackets


var rg = /^[^abc]$/
console.log(rg.test('bcd')) // false: the pattern allows exactly one character
console.log(rg.test('d')) // true

Most metacharacters need no escaping inside a character class


let str = "(bingo)";
console.log(/[.+]/g.test(str)); // false

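Inside the character class above, . and + stand for themselves. The string contains neither, so the test returns false.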
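
A few characters keep a special role inside a character class. The sketch below, with illustrative variable names, shows . and + matched literally, and then a class in which ], \, ^, and - are all matched literally.

```javascript
// "." and "+" are literal inside a class.
const dotOrPlus = /[.+]/;
console.log(dotOrPlus.test("a.b")); // true
console.log(dotOrPlus.test("ab"));  // false

// "]" and "\" are escaped; "^" is literal when it is not the first character;
// "-" is literal when it is the last character.
const specialChars = /[\]\\^-]/;
console.log(["]", "\\", "^", "-"].every((ch) => specialChars.test(ch))); // true
console.log(specialChars.test("a")); // false
```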

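() Capturing group

() defines a group. A group can contain several atoms and is treated as a single unit, for example by a quantifier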


var correct = /^(abc){3}$/  // correct: the whole group abc repeats three times
var wrong = /^abc{3}$/  // wrong: only c repeats three times

test


const str = "abfffab";
console.log(/^(\w+)(.*)(\1)$/g.test(str)); // true

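The regular expression above checks whether the string starts and ends with the same text. \1 refers to the content of the first group, \2 to the content of the second group, and so on.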

match

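Without the g flag, match stops at the first match and returns an array with the following entries

Key	Description
0	The full matched text
1, 2, ...	The captured groups
index	Position of the match in the original string
input	The original string
groups	Named groups

Example: matching an email address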


let str = `admin@baidu.com.cn`;
let reg = /^[\w-]+@([\w-]+\.)+(org|com|cc|cn)$/;
console.log(str.match(reg));
// [
//   'admin@baidu.com.cn',
//   'com.',
//   'cn',
//   index: 0,
//   input: 'admin@baidu.com.cn',
//   groups: undefined
// ]

With the g flag, match returns every match but omits the details (captured groups, index, input)


let str = `
<h1>item1</h1>
<h2>item2</h2>
<h3>item3</h3>
`;
// \1 refers to the content of the first group
let reg = /<(h[1-6])>([\s\S]*)<\/\1>/g;
console.log(str.match(reg));
// [ '<h1>item1</h1>', '<h2>item2</h2>', '<h3>item3</h3>' ]

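Group references

Inside the pattern, \n (\1, \2, ...) refers back to a captured group; inside a replacement string, $n inserts the text captured by group n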


let str = `
<h1>item1</h1>
<h2>item2</h2>
<h3>item3</h3>
`;
// \1 refers to the content of the first group
let reg = /<(h[1-6])>([\s\S]*)<\/\1>/gi;
// $2 inserts the text captured by the second group
console.log(str.replace(reg, `<span>$2</span>`));
// <span>item1</span>
// <span>item2</span>
// <span>item3</span>

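Non-capturing groups

Use (?:...) when a group should take part in matching but should not be captured in the result. The exec example that follows shows the effect.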


let str = `
<h1>item1</h1>
<h2>item2</h2>
<h3>item3</h3>
`;
// \1 refers to the content of the first group
let reg = /<(h[1-6])>(?:[\s\S]*)<\/\1>/g;
// The (?:...) group is not captured, so only group 1 appears in the result
console.log(reg.exec(str));
// [
//   '<h1>item1</h1>',
//   'h1',
//   index: 1,
//   input: '\n<h1>item1</h1>\n<h2>item2</h2>\n<h3>item3</h3>\n',
//   groups: undefined
// ]
console.log(reg.exec(str));
// [
//   '<h2>item2</h2>',
//   'h2',
//   index: 16,
//   input: '\n<h1>item1</h1>\n<h2>item2</h2>\n<h3>item3</h3>\n',
//   groups: undefined
// ]
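
Non-capturing groups are not specific to exec. As a small sketch with an illustrative input string, the same effect is visible in the result of match:

```javascript
// Two capturing groups: the result holds the full match plus two groups.
const withCapture = "2024-05".match(/(\d+)-(\d+)/);
console.log(withCapture.length); // 3
console.log(withCapture[1]);     // 2024

// The first group is non-capturing, so the month becomes group 1.
const withoutCapture = "2024-05".match(/(?:\d+)-(\d+)/);
console.log(withoutCapture.length); // 2
console.log(withoutCapture[1]);     // 05
```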

Named groups

Give a group a name with (?<name>...).


let str = "168.120.99.109uu168.120.99.109";
let reg = /(?<n1>\d+)\.(?<n2>\d+)\.(?<n3>\d+)\.(?<n4>\d+)/g;
for (let res of str.matchAll(reg)) {
  console.log(res);
}
// [
//   '168.120.99.109',
//   '168',
//   '120',
//   '99',
//   '109',
//   index: 0,
//   input: '168.120.99.109uu168.120.99.109',
//   groups: [Object: null prototype] {
//     n1: '168',
//     n2: '120',
//     n3: '99',
//     n4: '109'
//   }
// ]
// [
//   '168.120.99.109',
//   '168',
//   '120',
//   '99',
//   '109',
//   index: 16,
//   input: '168.120.99.109uu168.120.99.109',
//   groups: [Object: null prototype] {
//     n1: '168',
//     n2: '120',
//     n3: '99',
//     n4: '109'
//   }
// ]
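
Named groups can also be read from the groups property, referenced inside the pattern with \k<name>, and used in a replacement string with $<name>. A brief sketch; the group names and sample strings are chosen only for illustration.

```javascript
// Read named groups from the match result.
const ipRe = /(?<n1>\d+)\.(?<n2>\d+)\.(?<n3>\d+)\.(?<n4>\d+)/;
const { n1, n4 } = "168.120.99.109".match(ipRe).groups;
console.log(n1, n4); // 168 109

// \k<name> is the named form of a backreference such as \1.
const repeatedWord = /(?<word>\w+) \k<word>/;
console.log(repeatedWord.test("hello hello")); // true
console.log(repeatedWord.test("hello world")); // false

// $<name> inserts a named group in a replacement string.
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const reordered = "2019-02-12".replace(isoDate, "$<day>/$<month>/$<year>");
console.log(reordered); // 12/02/2019
```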

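Quantifiers

Basic usage

Quantifier	Description
*	Zero or more times
+	One or more times
?	Zero or one time
{n}	Exactly n times
{n,}	n or more times
{n,m}	Between n and m times (no space is allowed after the comma)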


var rg = /^a*$/ // the string must consist only of a's, or be empty
console.log(rg.test('')) // true
console.log(rg.test('c')) // false

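Example: username validation

Assume a username may contain only English letters, digits, underscores, or hyphens, and must be 6 to 16 characters long
/^[a-zA-Z0-9_-]{6,16}$/
Validation starts when the form field loses focus
If the value matches the pattern, the span that follows gets the class right
If the value does not match the pattern, the span that follows gets the class wrong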


var rg = /^[a-zA-Z0-9_-]{6,16}$/
var uname = document.querySelector('.uname')
var span = document.querySelector('span')

// Validate the username when the input loses focus
uname.onblur = function () {
  if (rg.test(this.value)) {
    span.className = 'right'
    span.innerHTML = 'Username is valid'
  } else {
    span.className = 'wrong'
    span.innerHTML = 'Username is invalid'
  }
}
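
The rule itself can be checked without a browser. The sketch below wraps the same pattern in a hypothetical function, isValidUsername, so that it can be tried in Node; the sample usernames are invented.

```javascript
const USERNAME_RE = /^[a-zA-Z0-9_-]{6,16}$/;

// Hypothetical helper: true when the name satisfies the rule described above.
function isValidUsername(name) {
  return USERNAME_RE.test(name);
}

console.log(isValidUsername("alice_01"));     // true
console.log(isValidUsername("bob"));          // false: shorter than 6 characters
console.log(isValidUsername("alice 01"));     // false: a space is not allowed
console.log(isValidUsername("a".repeat(17))); // false: longer than 16 characters
```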

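Lazy (non-greedy) matching

Quantifiers are greedy by default: they match as much as possible. Append ? to a quantifier to make it lazy, so that it matches as little as possible.

Syntax	Description
*?	Zero or more times, as few as possible
+?	One or more times, as few as possible
??	Zero or one time, as few as possible
{n,m}?	Between n and m times, as few as possible
{n,}?	n or more times, as few as possible

Example: greedy versus lazy matching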


let str = "aaaaaa";
console.log(str.match(/aa+?/g)); // [ 'aa', 'aa', 'aa' ]
console.log(str.match(/aa+/g)); // [ 'aaaaaa' ]
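
The difference matters most when a pattern has a closing delimiter. A small sketch with a made-up HTML fragment:

```javascript
const fragment = "<b>one</b><b>two</b>";

// Greedy: .* runs to the last </b> in the string.
const greedyTags = fragment.match(/<b>.*<\/b>/g);
console.log(greedyTags); // [ '<b>one</b><b>two</b>' ]

// Lazy: .*? stops at the first </b> it can reach.
const lazyTags = fragment.match(/<b>.*?<\/b>/g);
console.log(lazyTags); // [ '<b>one</b>', '<b>two</b>' ]
```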

Global matching

match

match returns the first match together with its details; with the g flag it returns all matches but without the details


let str = "aaaaaa";
console.log(str.match(/aa+?/)); // [ 'aa', index: 0, input: 'aaaaaa', groups: undefined ]
console.log(str.match(/aa+?/g)); // [ 'aa', 'aa', 'aa' ]

matchAll

Returns an iterator that yields every match together with its details. Note that the regular expression must have the g flag


let str = "aaaaaa";
let reg = /aa+?/g;
for (const res of str.matchAll(reg)) {
  console.log(res);
}
// [ 'aa', index: 0, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 2, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 4, input: 'aaaaaa', groups: undefined ]
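
Since matchAll returns an iterator, it can be passed straight to Array.from. The sketch below collects the match positions and also shows the TypeError raised when the g flag is missing; the variable names are illustrative.

```javascript
// Collect the index of every match.
const matchPositions = Array.from("aaaaaa".matchAll(/aa+?/g), (m) => m.index);
console.log(matchPositions); // [ 0, 2, 4 ]

// Without the g flag, matchAll throws a TypeError.
let missingFlagError = null;
try {
  "aaaaaa".matchAll(/aa+?/);
} catch (err) {
  missingFlagError = err;
}
console.log(missingFlagError instanceof TypeError); // true
```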

exec

With the g flag, repeated calls to exec return every match together with its details


let str = "aaaaaa";
let reg = /aa+?/g;
let res; // declared explicitly so that the loop also works in strict mode
while ((res = reg.exec(str))) {
  console.log(res);
}
// [ 'aa', index: 0, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 2, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 4, input: 'aaaaaa', groups: undefined ]

String methods

search

Searches for the pattern and returns the index of the first match in the string, or -1 if there is no match


let hello = "Hello, World!";
console.log(hello.search(/o/g)); // 4

match

Returns the match and its details (with the g flag, only the matched strings)


let hello = "Hello, World!";
console.log(hello.match(/o/)); // [ 'o', index: 4, input: 'Hello, World!', groups: undefined ]
console.log(hello.match(/o/g)); // [ 'o', 'o' ]

matchAll

Returns an iterator over the matches; the regular expression must have the g flag


let str = "aaaaaa";
let reg = /aa+?/g;
for (const res of str.matchAll(reg)) {
  console.log(res);
}
// [ 'aa', index: 0, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 2, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 4, input: 'aaaaaa', groups: undefined ]

split

Splits a string and returns an array


let str = "2019/02-12";
console.log(str.split(/-|\//)); // [ '2019', '02', '12' ]
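
Two further behaviours of split are worth a quick sketch: a capturing group in the separator keeps the separators in the result, and an optional second argument limits the number of pieces. The sample strings are illustrative.

```javascript
// A capturing group in the separator keeps the separators in the output.
const withSeparators = "2019/02-12".split(/([-\/])/);
console.log(withSeparators); // [ '2019', '/', '02', '-', '12' ]

// The second argument caps the number of returned items.
const firstTwo = "a,b,c".split(/,/, 2);
console.log(firstTwo); // [ 'a', 'b' ]
```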

replace

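replace() performs string replacement: str.replace(regexp|substr, replacement). The first argument, the pattern to search for, can be a string or a regular expression; the replacement can be a string or a function. To replace every match, add the g flag to the regular expression.

Special replacement patterns

Pattern	Description
$$	Inserts a "$"
$&	Inserts the matched substring
$`	Inserts the text to the left of the matched substring
$'	Inserts the text to the right of the matched substring
$n	Inserts the text captured by the nth group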


let str = "=△=";
console.log(str.replace(/△/, "$'$&$'")); // ==△==

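Replacement function

Parameter	Value
match	The matched substring (corresponds to $& above)
p1, p2, ...	The text captured by each group
offset	Position of the matched substring in the original string
string	The original string being searched
groups (NamedCaptureGroup)	An object holding the named capture groups

If the regular expression contains named groups, the last argument is the object of named group matches.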


function replacer(match, p1, p2, p3, offset, string) {
  // p1 is nondigits, p2 digits, and p3 non-alphanumerics
  console.log(p1); // abc
  console.log(p2); // 12345
  console.log(p3); // #$*%
  return [p1, p2, p3].join(" - ");
}
var newString = "abc12345#$*%".replace(/([^\d]*)(\d*)([^\w]*)/, replacer);
console.log(newString); // abc - 12345 - #$*%
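
Two more uses of a replacement function, sketched with hypothetical helpers (toCamelCase and swapYearMonth are illustrative names): one transforms a captured group, the other reads the named groups object that arrives as the last argument.

```javascript
// Transform a captured letter: "background-color" becomes "backgroundColor".
function toCamelCase(name) {
  return name.replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}
console.log(toCamelCase("background-color")); // backgroundColor

// With named groups, the last callback argument is the groups object.
function swapYearMonth(text) {
  return text.replace(/(?<year>\d{4})-(?<month>\d{2})/g, (...args) => {
    const groups = args[args.length - 1];
    return `${groups.month}/${groups.year}`;
  });
}
console.log(swapYearMonth("2019-02 and 2020-11")); // 02/2019 and 11/2020
```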

RegExp methods

Methods provided by the RegExp object

test

Tests whether a string matches the regular expression


const str = "abfffab";
console.log(/^(\w+)(.*)(\1)$/g.test(str)); // true

exec

Each call returns the next match, or null when there are no more matches; it is normally used with the g flag


let str = "aaaaaa";
let reg = /aa+?/g;
let res; // declared explicitly so that the loop also works in strict mode
while ((res = reg.exec(str))) {
  console.log(res);
}
// [ 'aa', index: 0, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 2, input: 'aaaaaa', groups: undefined ]
// [ 'aa', index: 4, input: 'aaaaaa', groups: undefined ]

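Assertions

Positive lookahead

(?=exp) succeeds only if the text that follows is exp. The text matched by the regular expression sits before (?=exp), and exp itself is not part of the match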


let hello = "Hello, World!";
let reg = /He\w+(?=,)/g;
console.log(reg.test(hello)); // true

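Positive lookbehind

(?<=exp) succeeds only if the text that precedes is exp. The text matched by the regular expression sits after (?<=exp), and exp itself is not part of the match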


let str = "name: Alice";
let reg = /(?<=name: )\w+/;
console.log(str.match(reg)); // [ 'Alice', index: 6, input: 'name: Alice', groups: undefined ]

Negative lookahead

(?!exp) succeeds only if the text that follows is not exp. The text matched by the regular expression sits before (?!exp)


let str = "hello123, world";
let reg = /[a-z]+(?!\d+)/g;
console.log(str.match(reg)); // [ 'hell', 'world' ]

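This matches lowercase letters that are not followed by a digit. The first result is 'hell' rather than 'hello' because the engine backtracks by one letter, so that the match no longer ends directly before a digit.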

Negative lookbehind

(?<!exp) succeeds only if the text that precedes is not exp. The text matched by the regular expression sits after (?<!exp)


let str = "hello123, world";
let reg = /(?<!\d+)[a-z]+/g;
console.log(str.match(reg)); // [ 'hello', 'world' ]
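
As a closing sketch of how assertions combine in practice, the hypothetical function below (addThousandsSeparators is an illustrative name) inserts commas into a string of digits using lookaheads, and a lookbehind picks out numbers that follow a dollar sign.

```javascript
// Insert a comma at every position that is followed by a multiple of three
// digits up to the end of the number. \B avoids a comma at the very start.
function addThousandsSeparators(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}
console.log(addThousandsSeparators("1234567")); // 1,234,567
console.log(addThousandsSeparators("123"));     // 123

// Lookbehind: only numbers directly preceded by "$" are matched.
const dollarAmounts = "$42 and €17".match(/(?<=\$)\d+/g);
console.log(dollarAmounts); // [ '42' ]
```

Because assertions are zero-width, the replace call inserts commas without consuming or changing any digit.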