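Wildcards and Regular Expressions
Wildcards
Wildcards are used to match file names (at least, that is how they work on Linux).
Differences between regular expressions and wildcards
Regular expressions match strings inside a file, and the match is a containment match: a line is selected when any part of it matches the pattern. Commands such as grep, awk, and sed support regular expressions.
The following lists the lines of install.log that contain the string net: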
[root@localhost ~]# grep net install.log -n
203:Installing net-tools-1.60-110.el6_2.x86_64
413:Installing net-snmp-libs-5.5-49.el6.x86_64
521:Installing python-netaddr-0.7.5-4.el6.noarch
867:Installing net-snmp-5.5-49.el6.x86_64
929:Installing system-config-network-tui-1.6.0.el6.2-1.el6.noarch
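Wildcards match file names that meet a condition, and the match is an exact match: the pattern has to describe the whole name. Commands such as ls and cp do not interpret regular expressions, and find matches names against wildcard patterns through -name (GNU find also offers a separate -regex test), so file names are normally matched with the shell's own wildcards.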
[root@localhost ~]# ls
anaconda-ks.cfg Downloads Hello.sh Music rightfile testfile
Desktop error install.log Pictures Templates Videos
Documents errorfile install.log.syslog Public test
[root@localhost ~]# ls install.log
install.log
[root@localhost ~]# ls install
ls: cannot access install: No such file or directory
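The contrast can be reproduced in a scratch directory. The following is a minimal sketch and not part of the original session: the temporary directory and the variable names glob_matches and regex_matches are made up for illustration, and the file names are borrowed from the listing above.

```shell
# Create a throwaway directory holding three of the file names listed above.
tmpdir=$(mktemp -d)
touch "$tmpdir/install.log" "$tmpdir/install.log.syslog" "$tmpdir/testfile"

# Wildcard: the shell expands install* into every name that starts with
# "install" before ls runs, so ls only ever sees complete file names.
glob_matches=$(cd "$tmpdir" && ls install*)

# Regular expression: grep treats each name as a line of text and keeps the
# lines that contain "file" anywhere (containment match).
regex_matches=$(ls "$tmpdir" | grep "file")

echo "$glob_matches"     # install.log and install.log.syslog, one per line
echo "$regex_matches"    # testfile
rm -rf "$tmpdir"
```

Without the trailing *, the word install would have to be the complete file name, which is why ls install fails in the session above.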
Basic regular expressions
“*” matches the preceding character zero or more times
[root@localhost ~]# vim test
[root@localhost ~]# cat test
a1
aa2
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "a*" test #matches every line, including empty lines, because "a*" also matches zero occurrences of a
a1
aa2
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "aa*" test #matches lines containing at least one a
a1
aa2
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "aaa*" test #matches lines containing at least two consecutive a's
aa2
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "aaaa*" test #matches lines containing at least three consecutive a's
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "aaaaa*" test #matches lines containing at least four consecutive a's
aaaa4
aaaaa5
[root@localhost ~]# grep "aaaaaa*" test #matches lines containing at least five consecutive a's
aaaaa5
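The counting rule behind these patterns can be checked with grep -c, which prints the number of matching lines. This is a small sketch on a made-up three-line sample (the variable names are illustrative and not from the session above):

```shell
# Hypothetical three-line sample; only the first two lines contain an a.
sample=$(printf 'a1\naa2\nbbb3')

# "a*" may match zero a's, so every line counts as a match.
count_zero_or_more=$(printf '%s\n' "$sample" | grep -c "a*")

# "aa*" needs one literal a followed by zero or more further a's.
count_at_least_one=$(printf '%s\n' "$sample" | grep -c "aa*")

# "aaa*" needs two literal a's followed by zero or more further a's.
count_at_least_two=$(printf '%s\n' "$sample" | grep -c "aaa*")

echo "$count_zero_or_more $count_at_least_one $count_at_least_two"   # prints: 3 2 1
```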
“.” matches any single character except a newline
[root@localhost ~]# vim test
[root@localhost ~]# cat test
a1
aa2
aaa3
aaaa4
aaaaa5
seed seeed s d
s!@#d s123d
sd
[root@localhost ~]# grep "s..d" test #"s..d" matches an s and a d with exactly two characters between them
seed seeed s d
[root@localhost ~]# grep "s.*d" test #matches an s and a d with any characters (including none) between them
seed seeed s d
s!@#d s123d
sd
[root@localhost ~]# grep ".*" test #matches every line
a1
aa2
aaa3
aaaa4
aaaaa5
seed seeed s d
s!@#d s123d
sd
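To see which part of a line a pattern actually matched, the -o option (available in GNU and BSD grep, though not required by POSIX) prints only the matched text. A short sketch on a made-up one-line sample, with illustrative variable names:

```shell
# Hypothetical sample line built from the words used in the test file.
line='seed seeed s!@#d sd'

# "s..d": exactly two characters between s and d, so only "seed" qualifies.
two_between=$(printf '%s\n' "$line" | grep -o "s..d")

# "s.*d": the match starts at the first s and extends as far as possible,
# so it runs to the last d on the line.
any_between=$(printf '%s\n' "$line" | grep -o "s.*d")

echo "$two_between"    # seed
echo "$any_between"    # seed seeed s!@#d sd
```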
“^” matches the start of a line, “$” matches the end of a line
[root@localhost ~]# grep "^a" test #matches lines beginning with the letter "a"
a1
aa2
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "d$" test #matches lines ending with the letter "d"
seed seeed s d
s!@#d s123d
sd
[root@localhost ~]# grep -n "^$" test #matches empty lines
4:
7:
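Combining both anchors turns the containment match into a whole-line match. A minimal sketch on a made-up four-line sample (variable names are illustrative):

```shell
# Hypothetical sample: "sd", "seed", one empty line, "sdx".
sample=$(printf 'sd\nseed\n\nsdx')

# "^$": start of line immediately followed by end of line, i.e. an empty line.
empty_count=$(printf '%s\n' "$sample" | grep -c "^$")

# "^sd$": the line must consist of exactly "sd"; "sdx" and "seed" are rejected.
whole_line=$(printf '%s\n' "$sample" | grep "^sd$")

echo "$empty_count"   # 1
echo "$whole_line"    # sd
```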
“[]” matches any one of the characters listed inside the brackets, and only a single character
[root@localhost ~]# vim test
[root@localhost ~]# cat test
a1
aa2
aaa3
aaaa4
aaaaa5
seed seeed s d
s!@#d s123d
sd
saoid
said
soid
s1d
s2d
ABS
Abb
1
2
7
8
24
91
[root@localhost ~]# grep "s[ao]id" test #matches an s and an i with either a or o between them
said
soid
[root@localhost ~]# grep "[0-9]" test #matches lines containing any digit
a1
aa2
aaa3
aaaa4
aaaaa5
s!@#d s123d
s1d
s2d
1
2
7
8
24
91
[root@localhost ~]# grep "^[a-z]" test #matches lines beginning with a lowercase letter
a1
aa2
aaa3
aaaa4
aaaaa5
seed seeed s d
s!@#d s123d
sd
saoid
said
soid
s1d
s2d
[root@localhost ~]# grep "^[A-Z]" test #matches lines beginning with an uppercase letter
ABS
Abb
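Bracket expressions combine naturally with "*" and the anchors. The sketch below uses a made-up sample and illustrative variable names. Because ranges such as [a-z] can behave differently in some locales, it pins LC_ALL=C for the range examples and, as an alternative, uses the POSIX class [[:upper:]]:

```shell
# Hypothetical sample lines.
sample=$(printf 'said\nsoid\nsuid\nABS\n24')

# One character from the set {a, o} between s and id: "suid" is rejected.
either_a_or_o=$(printf '%s\n' "$sample" | LC_ALL=C grep -c "s[ao]id")

# A digit followed by zero or more digits, anchored at both ends:
# the line must consist of digits only.
digits_only=$(printf '%s\n' "$sample" | LC_ALL=C grep "^[0-9][0-9]*$")

# POSIX character class for uppercase letters, independent of range ordering.
upper_start=$(printf '%s\n' "$sample" | grep "^[[:upper:]]")

echo "$either_a_or_o"   # 2
echo "$digits_only"     # 24
echo "$upper_start"     # ABS
```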
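“[^]” matches any single character other than the ones listed inside the brackets
[root@localhost ~]# grep "^[^a-z]" test #matches lines that do not begin with a lowercase letter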
ABS
Abb
1
2
7
8
24
91
[root@localhost ~]# grep "^[^a-zA-Z]" test #matches lines that do not begin with a letter
1
2
7
8
24
91
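The caret means two different things depending on where it sits: outside the brackets it anchors to the start of the line, inside the brackets (as the first character) it negates the set. A small sketch with a made-up sample and illustrative variable names:

```shell
# Hypothetical sample lines.
sample=$(printf '1\n1sd\n24\nABS')

# "[^0-9]" with no anchor: lines containing at least one non-digit anywhere.
contains_non_digit=$(printf '%s\n' "$sample" | grep "[^0-9]")

# "^[^0-9]": the first character of the line must be a non-digit.
starts_with_non_digit=$(printf '%s\n' "$sample" | grep "^[^0-9]")

echo "$contains_non_digit"      # 1sd and ABS, one per line
echo "$starts_with_non_digit"   # ABS
```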
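“\” is the escape character: it removes the special meaning of the character that follows it
[root@localhost ~]# grep . test #an unescaped "." matches any character, so every non-empty line is printed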
a1
aa2
aaa3
aaaa4
aaaaa5
seed seeed s d
s!@#d s123d
sd
saoid
said
soid
s1d
s2d
ABS
Abb
1
2
7
8
24
91
[root@localhost ~]# vim test
[root@localhost ~]# cat test
a1
aa2
aaa3
aaaa4
aaaaa5
seed seeed s d.
s!@#d s123d.
sd.
saoid
said
soid
s1d
s2d
ABS
Abb
1
2
7
8
24
91
[root@localhost ~]# grep "\.$" test #"\." matches a literal dot, so only lines ending with "." are printed
seed seeed s d.
s!@#d s123d.
sd.
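The effect of the escape is easiest to see by counting matches with and without it. This sketch uses a made-up sample and illustrative variable names; it also shows grep -F, which treats the whole pattern as a fixed string so that no escaping is needed:

```shell
# Hypothetical sample: only the first line contains a literal dot.
sample=$(printf '1.5\n125\n1x5')

# Unescaped: "." stands for any character, so all three lines match.
any_char_count=$(printf '%s\n' "$sample" | grep -c "1.5")

# Escaped: "\." stands for a literal dot, so only "1.5" matches.
literal_dot_count=$(printf '%s\n' "$sample" | grep -c "1\.5")

# -F disables regular expression syntax altogether.
fixed_string_count=$(printf '%s\n' "$sample" | grep -F -c "1.5")

echo "$any_char_count $literal_dot_count $fixed_string_count"   # prints: 3 1 1
```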
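“\{n\}” means the preceding character occurs exactly n times
[root@localhost ~]# grep "a\{3\}" test #matches lines containing three consecutive a's; longer runs such as aaaa also contain three, so they match too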
aaa3
aaaa4
aaaaa5
[root@localhost ~]# grep "[0-9]\{2\}" test #matches lines containing two consecutive digits
s!@#d s123d.
24
91
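Because grep selects a line as soon as any part of it matches, "exactly n" only limits the matched text, not the line. One way to enforce the count on the whole line is the -x option, which requires the pattern to match the entire line. A sketch with a made-up sample and illustrative variable names:

```shell
# Hypothetical sample: runs of two, three, and four a's.
sample=$(printf 'aa\naaa\naaaa')

# Containment match: "aaaa" contains three consecutive a's, so it counts too.
contains_three=$(printf '%s\n' "$sample" | grep -c "a\{3\}")

# Whole-line match: only the line that is exactly "aaa" is selected.
exactly_three=$(printf '%s\n' "$sample" | grep -x "a\{3\}")

echo "$contains_three"   # 2
echo "$exactly_three"    # aaa
```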
“\{n,\}” means the preceding character occurs at least n times
[root@localhost ~]# vim test
[root@localhost ~]# cat test
the a1
the aa2
the aaa3
the aaaa4
the aaaaa5
google
gooogle
goooogle
oogle
ooogle
oooogle
12saoid
123said
1234soid
12345s1d
[root@localhost ~]# grep "^[0-9]\{3,\}[a-z]" test #matches lines beginning with at least three consecutive digits
123said
1234soid
12345s1d
[root@localhost ~]# grep "^[0-9]\{4,\}[a-z]" test #matches lines beginning with at least four consecutive digits
1234soid
12345s1d
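An "at least n" interval is shorthand for what the "*" section did by repeating the character: "o\{2,\}" and "ooo*" describe the same strings. The sketch below checks this on a made-up sample (variable names are illustrative):

```shell
# Hypothetical sample: one, two, and three o's between g and gle.
sample=$(printf 'gogle\ngoogle\ngooogle')

# Interval form: g, then at least two o's, then gle.
interval_matches=$(printf '%s\n' "$sample" | grep "go\{2,\}gle")

# Star form: g, two literal o's, zero or more further o's, then gle.
star_matches=$(printf '%s\n' "$sample" | grep "gooo*gle")

echo "$interval_matches"   # google and gooogle, one per line
echo "$star_matches"       # the same two lines
```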
“\{n,m\}” means the preceding character occurs at least n times and at most m times.
[root@localhost ~]# vim test
[root@localhost ~]# cat test
the a1
the aa2
the aaa3
the aaaa4
the aaaaa5
google
gooogle
goooogle
oogle
ooogle
oooogle
abc
abbc
abbbc
adc
addc
ac
[root@localhost ~]# grep "ab\{1,3\}c" test #matches strings with one to three b's between the letters a and c
abc
abbc
abbbc
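The upper bound m only takes effect when something after the interval pins down where the run must end; here that is the trailing c. A sketch with a made-up sample and illustrative variable names:

```shell
# Hypothetical sample: zero, one, three, and four b's between a and c.
sample=$(printf 'ac\nabc\nabbbc\nabbbbc')

# With the trailing c, a run of four b's is too long and "ac" has too few.
bounded=$(printf '%s\n' "$sample" | grep "ab\{1,3\}c")

# Without the trailing c, "abbbbc" still contains "abbb", so it matches as well.
unbounded_count=$(printf '%s\n' "$sample" | grep -c "ab\{1,3\}")

echo "$bounded"           # abc and abbbc, one per line
echo "$unbounded_count"   # 3
```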
Inverse selection
[root@localhost ~]# grep -vn "the" test
#select the lines that do not contain "the" and show their line numbers
6:
7:google
8:gooogle
9:goooogle
10:oogle
11:ooogle
12:oooogle
13:
14:abc
15:abbc
16:abbbc
17:adc
18:addc
19:ac
[root@localhost ~]# grep -v "the" test
#select the lines that do not contain "the"
google
gooogle
goooogle
oogle
ooogle
oooogle
abc
abbc
abbbc
adc
addc
ac
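With -v, a line is dropped when the pattern matches anywhere in it. To drop only the lines that begin with a word, the pattern has to be anchored with "^". A sketch on a made-up sample (the line bathe and the variable names are illustrative):

```shell
# Hypothetical sample: "the" at the start, absent, and in the middle of a word.
sample=$(printf 'the a1\ngoogle\nbathe')

# Unanchored: "bathe" contains "the", so it is dropped as well.
without_the=$(printf '%s\n' "$sample" | grep -v "the")

# Anchored: only lines that start with "the" are dropped.
not_starting_with_the=$(printf '%s\n' "$sample" | grep -v "^the")

echo "$without_the"             # google
echo "$not_starting_with_the"   # google and bathe, one per line
```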