Regular Expressions

Using findall:

re.findall returns every non-overlapping match of the pattern in a string; the result is a list.

(1) Basic syntax:

import re

re.findall("c","caiyingyi")

Or, with a pre-compiled pattern:

import re

rule = re.compile("c")

rule.findall("caiyingyi")
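Both forms should give the same list. A minimal sketch using the same pattern and string as above; the variable names direct and compiled are only for illustration:

```python
import re

# Module-level call: the pattern string is compiled internally.
direct = re.findall("c", "caiyingyi")

# Pre-compiled pattern object, convenient when the same rule is reused.
rule = re.compile("c")
compiled = rule.findall("caiyingyi")

print(direct)    # ['c']
print(compiled)  # ['c']
```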

(2) How findall treats parentheses (groups)

>>> import re

>>> s = "adfad asdfasdf asdfas asdfawef asd adsfas "

>>> reObj1 = re.compile(r'((\w+)\s+\w+)')

>>> reObj1.findall(s)

[('adfad asdfasdf', 'adfad'), ('asdfas asdfawef', 'asdfas'), ('asd adsfas', 'asd')]

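When the regular expression contains several pairs of parentheses, each list element is a tuple of strings. The tuple holds one string per pair of parentheses, each string is the text matched by the corresponding parenthesised sub-expression, and the strings are ordered by where each opening parenthesis appears.

>>> reObj2 = re.compile(r'(\w+)\s+\w+')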

>>> reObj2.findall(s)

['adfad', 'asdfas', 'asd']

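When the regular expression contains exactly one pair of parentheses, each list element is a string holding the text matched by the parenthesised sub-expression (not the text matched by the whole expression).

>>> reObj3 = re.compile(r'\w+\s+\w+')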

>>> reObj3.findall(s)

['adfad asdfasdf', 'asdfas asdfawef', 'asd adsfas']

When the regular expression contains no parentheses, each list element is a string holding the text matched by the whole expression.
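The three cases can be compared side by side on a shorter string. This is only a sketch: the sample string and variable names are made up for illustration. The last call uses a non-capturing group (?:...), which is one way to keep parentheses for grouping while still getting whole matches back.

```python
import re

s = "ab cd ef gh"

# Two groups: each element is a tuple, ordered by opening parenthesis.
two_groups = re.findall(r'((\w+)\s+\w+)', s)
# One group: each element is the text matched by that group only.
one_group = re.findall(r'(\w+)\s+\w+', s)
# No groups: each element is the whole match.
no_group = re.findall(r'\w+\s+\w+', s)
# Non-capturing group: behaves like the no-group case.
non_capturing = re.findall(r'(?:\w+)\s+\w+', s)

print(two_groups)     # [('ab cd', 'ab'), ('ef gh', 'ef')]
print(one_group)      # ['ab', 'ef']
print(no_group)       # ['ab cd', 'ef gh']
print(non_capturing)  # ['ab cd', 'ef gh']
```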

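Using search:

If string contains a substring that matches pattern, re.search returns a Match object; otherwise it returns None. Note that if string contains several matching substrings, only the first one is reported. search itself does not return a string; calling .group() on the Match object gives the matched text.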

import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

m = re.search(r'\shan(ds)ome\s', text).group()
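Because search can return None, chaining .group() directly raises AttributeError when nothing matches. A hedged sketch of a safer pattern, reusing the text above; the names whole, inner, missing and first_is are illustrative only:

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

# search returns a Match object, or None when nothing matches,
# so check the result before calling .group().
m = re.search(r'\shan(ds)ome\s', text)
whole = m.group() if m else None    # ' handsome ' (spaces included)
inner = m.group(1) if m else None   # 'ds', the parenthesised part

# No match: search returns None rather than raising.
missing = re.search(r'ugly', text)

# 'is' occurs twice in text; only the first occurrence is reported.
first_is = re.search(r'is', text)
first_pos = first_is.start()        # 6
```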

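Using match:

re.match only matches at the beginning of the string. If the start of the string does not fit the regular expression, the match fails and the function returns None. (re.search, by contrast, scans the whole string until it finds a match.)

On success the return value is a Match object; calling .group() on it gives the matched string.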

import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

m = re.match(r"(\w+)\s", text).group()
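The difference between match and search shows up when the pattern occurs somewhere other than the start. A small sketch on the same text; the variable names are assumptions made for the example:

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

# The pattern fits at position 0, so match succeeds.
at_start = re.match(r"(\w+)\s", text)
matched = at_start.group()     # 'JGood ' (includes the trailing space)
first_word = at_start.group(1) # 'JGood'

# 'handsome' is not at the start: match fails, search succeeds.
via_match = re.match(r"handsome", text)    # None
via_search = re.search(r"handsome", text)  # Match object
```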

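Using group:

When matching succeeds, match() and search() return a Match object. To get the matched text, call the Match object's group(), groups() or group(index) method.

group(): the substring of the subject string matched by the whole pattern;

group(0): same result as group();

groups(): a tuple made up of all the groups. group(1) is the substring matched by the first group in the pattern, group(2) the second, and so on; if index is out of range, group(index) raises IndexError.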

>>> import re

>>> s = '23432werwre2342werwrew'

>>> p = r'(\d*)([a-zA-Z]*)'

>>> m = re.match(p,s)

>>> m.group()

'23432werwre'

>>> m.group(0)

'23432werwre'

>>> m.group(1)

'23432'

>>> m.group(2)

'werwre'

>>> m.groups()

('23432', 'werwre')
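The out-of-range case mentioned above can be checked with the same pattern, which has only two groups. A minimal sketch; the flag name raised is hypothetical:

```python
import re

m = re.match(r'(\d*)([a-zA-Z]*)', '23432werwre2342werwrew')

# The pattern defines two groups, so index 3 is out of range.
raised = False
try:
    m.group(3)
except IndexError:
    raised = True

group_count = len(m.groups())  # 2
print(raised)                  # True
```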

Using split

Splits string at every substring that matches the pattern and returns the pieces as a list.

For example, re.split(r'\s+', text) splits a string on whitespace into a list of words.

re.split(r'\d+','one1two2three3four4five5')

The result is:

['one', 'two', 'three', 'four', 'five', '']
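The trailing empty string appears because the input ends with a match ('5'), so split reports the empty piece after it. A sketch of the whitespace split mentioned above, plus two common ways of handling the result; the names words, non_empty and first_split are illustrative:

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

# Split on runs of whitespace; punctuation stays attached to words.
words = re.split(r'\s+', text)

parts = re.split(r'\d+', 'one1two2three3four4five5')
# Drop empty pieces such as the trailing ''.
non_empty = [p for p in parts if p]

# maxsplit limits the number of splits performed.
first_split = re.split(r'\d+', 'one1two2three3four4five5', maxsplit=1)

print(words[:4])    # ['JGood', 'is', 'a', 'handsome']
print(non_empty)    # ['one', 'two', 'three', 'four', 'five']
print(first_split)  # ['one', 'two2three3four4five5']
```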

Using sub

Replaces every substring of string that matches the pattern with the replacement, and returns the resulting string.

import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

re.sub(r'\s+', '-', text)

The result is:

'JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...'
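As a small extension of the example above, sub also accepts a count argument to limit the number of replacements, and re.subn reports how many replacements were made. This is a sketch rather than part of the original notes; the variable names are assumptions:

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

# Replace every run of whitespace.
all_replaced = re.sub(r'\s+', '-', text)

# Replace only the first two runs of whitespace.
first_two = re.sub(r'\s+', '-', text, count=2)

# subn returns (new_string, number_of_replacements).
new_text, n = re.subn(r'\s+', '-', text)

print(first_two)  # JGood-is-a handsome boy, he is cool, clever, and so on...
print(n)          # 11
```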

Last edited: 2017.12.10 05:18:45
