- A regular expression is a way of processing strings line by line: with the help of a few special symbols, it lets you search for or replace specific strings.
grep
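- Compares string data, one whole line at a time.
grep [-acinv] 'search string' filename
-a search a binary file as if it were a text file
-c count the lines that contain the 'search string' (matching lines, not individual occurrences)
-i ignore case
-n also print line numbers
-v invert the selection: show the lines that do not contain the search string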
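A small sketch of how these options behave, run against a made-up three-line file (the file name and its contents are assumptions for illustration, not part of the notes). It also shows that -c counts matching lines, while piping `grep -o` (available in GNU and BSD grep) into `wc -l` counts individual occurrences.

```shell
# Build a tiny sample file in a temporary directory (hypothetical content).
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'the cat and the dog' 'The End' 'no match at all' > "$sample"

# -c counts matching LINES: only line 1 contains lowercase "the".
matching_lines=$(grep -c 'the' "$sample")

# -o prints every match on its own line, so wc -l counts OCCURRENCES.
occurrences=$(grep -o 'the' "$sample" | wc -l | tr -d ' ')

# -i ignores case, so "The End" matches as well.
matching_lines_nocase=$(grep -ci 'the' "$sample")

# -v inverts the selection: lines that do NOT contain "the".
inverted_lines=$(grep -vc 'the' "$sample")

echo "$matching_lines $occurrences $matching_lines_nocase $inverted_lines"
rm -rf "$tmpdir"
```

With this sample the four numbers differ only because line 1 holds two occurrences and line 2 differs in case.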
1. Basic usage
(base) [root@localhost Orthogroups]# grep '10' OG.txt
OG0010465
OG0010685
OG0010467
OG0010469
OG0010470
OG0010520
OG0001017
OG0001050
OG0001062
OG0003106
OG0010501
OG0010483
OG0015110
OG0000810
OG0001100
OG0010686
OG0000510
OG0010471
OG0001052
OG0016106
OG0001060
OG0001012
OG0001013
OG0001014
100DG.pep<16>:
(base) [root@localhost test]# grep -n 'the' grep_test.txt
8:I can't finish the test.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
2. The -i option: ignore case
(base) [root@localhost test]# grep -in 'the' grep_test.txt
8:I can't finish the test.
9:Oh! The soup taste good.
12:the symbol '*' is represented as start.
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
3. Using [] to search for a set of characters
Search for test and taste
3.1 Whatever is listed inside [] stands for just one character
grep -n 't[ea]st' grep_test.txt
8:I can't finish the test.
9:Oh! The soup taste good.
3.2 [^] negated selection
3.2.1 Search for oo
grep -n 'oo' grep_test.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
3.2.2 When the oo must not be preceded by g
grep -n '[^g]oo' grep_test.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
3.2.3 When the oo must not be preceded by a lowercase letter
grep -n '[^a-z]oo' grep_test.txt
3:Football game is not use feet only.
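The bracket rules above can be tried on a smaller file. This is only a sketch: the six sample lines are invented for illustration, and LC_ALL=C is set on the range example as a precaution, since in some locales a range like a-z may also cover uppercase letters.

```shell
# Hypothetical sample file; not the grep_test.txt used in these notes.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'good food' 'Foo bar' 'google' 'test' 'taste' 'tost' > "$sample"

# [ea] matches exactly one character, so "test" and "taste" match, "tost" does not.
set_hits=$(grep -n 't[ea]st' "$sample")

# [^g]oo: "oo" preceded by one character that is not g.
# "good food" still matches because of "foo"; "google" only has "goo".
not_g=$(grep -n '[^g]oo' "$sample")

# [^a-z]oo: "oo" preceded by a character that is not a lowercase letter.
not_lower=$(LC_ALL=C grep -n '[^a-z]oo' "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$set_hits" "$not_g" "$not_lower"
rm -rf "$tmpdir"
```

The "good food" line is the same effect seen in the real output above, where line 18 survives '[^g]oo' because of the too in tools.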
4. Searching for digits
4.1 Find the lines that contain a digit
grep -n '[0-9]' grep_test.txt
5:However, this dress is about $ 3183 dollars.
15:You are the best is mean you are the no. 1.
4.2 Find the lines that contain a letter (upper or lower case) or a digit
grep -n '[a-zA-Z0-9]' grep_test.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
8:I can't finish the test.
9:Oh! The soup taste good.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
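5. Line-start and line-end anchors: ^ and $
- Note the difference between ^ inside [] and ^ outside []: inside [] it negates the set, outside [] it means the pattern sits at the start of the line
# the at the start of a line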
grep -n '^the' grep_test.txt
12:the symbol '*' is represented as start.
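To make the two meanings of ^ concrete, here is a minimal sketch on an invented four-line file (the contents are assumptions for illustration). The last pattern uses both meanings at once: the first ^ anchors to the line start, the second negates the set.

```shell
# Hypothetical sample file.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'the start' 'The start' '# comment' 'at the end.' > "$sample"

# ^ outside []: "the" must sit at the very start of the line.
line_start=$(grep -n '^the' "$sample")

# ^ outside [] followed by a set: the line starts with a lowercase letter.
starts_lower=$(LC_ALL=C grep -n '^[a-z]' "$sample")

# ^ outside [] AND ^ inside []: the line starts with something that is not a letter.
starts_nonletter=$(LC_ALL=C grep -n '^[^a-zA-Z]' "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$line_start" "$starts_lower" "$starts_nonletter"
rm -rf "$tmpdir"
```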
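5.1 Lines ending with .
- The . must be escaped with the escape character (\), because an unescaped . matches any single character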
(base) [root@localhost test]# grep -n '\.$' grep_test.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
8:I can't finish the test.
9:Oh! The soup taste good.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.
cat -A grep_test.txt (what does this do? To look into later)
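On the cat -A question: in GNU coreutils, cat -A makes normally invisible characters visible, printing a tab as ^I, a carriage return as ^M, and a $ at each line end. The sketch below assumes GNU cat and an invented three-line file; it shows why a line saved with DOS line endings (\r\n) would not match '\.$' even though it looks like it ends with a period.

```shell
# Hypothetical file: line 1 ends Unix-style, line 2 ends DOS-style (\r\n), line 3 is empty.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf 'unix line.\ndos line.\r\n\n' > "$sample"

# cat -A (GNU coreutils) marks each line end with $ and shows \r as ^M.
visible=$(cat -A "$sample")

# Only line 1 really ends with "."; on line 2 the last character is \r.
dot_end=$(grep -n '\.$' "$sample")

# The empty line is the only one matched by ^$.
blank=$(grep -n '^$' "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$visible" "$dot_end" "$blank"
rm -rf "$tmpdir"
```

If a line that visibly ends with a period is missing from the '\.$' output, checking it with cat -A for a trailing ^M is a reasonable first step.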
5.2 Print the blank lines
grep -n '^$' grep_test.txt
5.3 Leave out the # comment lines and the blank lines
grep -v '^$' grep_test.txt | grep -v '^#'
6. egrep: extended regular expressions
# equivalent to the pipeline above
egrep -v '^$|^#' grep_test.txt
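A quick check that the two-step pipeline and the single extended expression give the same result, sketched on an invented config-style file (name and contents are assumptions). grep -E is used here in place of egrep; the two accept the same patterns, and recent GNU grep releases print a warning that egrep is obsolescent.

```shell
# Hypothetical config-like file with comments and a blank line.
tmpdir=$(mktemp -d)
conf="$tmpdir/demo.conf"
printf '%s\n' '# a comment' '' 'key=value' '  # indented comment' 'other=1' > "$conf"

# Basic regex, two passes: drop blank lines, then drop lines starting with #.
two_step=$(grep -v '^$' "$conf" | grep -v '^#')

# Extended regex, one pass: | means "or".
one_step=$(grep -Ev '^$|^#' "$conf")

printf '%s\n' "$one_step"
rm -rf "$tmpdir"
```

Note that the indented comment is kept by both forms, because ^# only matches a # in the first column.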
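6.1 Special symbols in extended regular expressions
+ one or more repetitions of the preceding character
? zero or one of the preceding character
| or
() groups a string: 'g(la|oo)d' matches glad or good
() can also be used to test for a group repeated several times
echo 'AxyzxyzxyzxyzC' | egrep 'A(xyz)+C' finds a string that starts with A, ends with C, and has one or more xyz in between
. stands for exactly one character, which must be present
* is not a wildcard in a regular expression; it repeats the preceding character zero or more times, so it is always combined with something before it
{} limits the number of repetitions
! is not a special character in a regular expression; for negation, remember [^]
{} has a special meaning in the shell, and in grep's basic regular expressions the braces are written with the escape character as \{ \}
\ escape character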
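The extended symbols listed above can be checked one at a time. In this sketch, check is a hypothetical helper defined only for the example, and grep -E stands in for egrep; the test strings other than AxyzxyzxyzxyzC, glad and good are made up.

```shell
# check PATTERN STRING: report whether STRING matches the extended regex PATTERN.
check() {
    if printf '%s\n' "$2" | grep -Eq "$1"; then
        echo match
    else
        echo no-match
    fi
}

group_repeat=$(check 'A(xyz)+C' 'AxyzxyzxyzxyzC')  # + needs at least one xyz
group_empty=$(check 'A(xyz)+C' 'AC')               # zero repetitions is not enough
either_glad=$(check 'g(la|oo)d' 'glad')            # first alternative
either_god=$(check 'g(la|oo)d' 'god')              # neither la nor oo
optional_o=$(check 'goo?d' 'god')                  # ? allows the second o to be absent
plus_o=$(check 'go+d' 'gd')                        # + requires at least one o

echo "$group_repeat $group_empty $either_glad $either_god $optional_o $plus_o"
```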
# find lines that contain '
grep -n \' test.txt
6.2 Find lines that contain ! or >
grep -n '[!>]' grep_test.txt
No blank lines and no lines starting with #
grep -v '^$' test.txt | grep -v '^#'
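Example 1
Starts with g, ends with d, and is four characters long
grep 'g..d' test.txt
Print lines that have two or more consecutive o characters
grep 'ooo*' test.txt
By the same logic, one or more o characters would be
grep 'oo*' test.txt
Find strings that start with g and end with g
grep 'g.*g' test.txt
# Note: * means zero or more of the preceding character, so .* matches any run of characters
Find lines that contain any digit
grep '[0-9][0-9]*' test.txt
# of course, the following does the same job
grep -n '[0-9]' test.txt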
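The . and * patterns in this example can be verified on a handful of short words. This is a sketch on an invented six-line file (the contents are assumptions, not the test.txt from the notes); each variable holds the numbered lines that one pattern selects.

```shell
# Hypothetical sample file, one short string per line.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'good' 'glad' 'gd' 'gg' 'go' 'id 42' > "$sample"

# g..d: g, exactly two arbitrary characters, then d ("gd" is too short).
four_chars=$(grep -n 'g..d' "$sample")

# g.*g: g, any run of characters (possibly empty), then g.
g_to_g=$(grep -n 'g.*g' "$sample")

# oo*: one o followed by zero or more o, i.e. at least one o.
one_plus_o=$(grep -n 'oo*' "$sample")

# ooo*: two o followed by zero or more o, i.e. at least two in a row.
two_plus_o=$(grep -n 'ooo*' "$sample")

# As a line filter, [0-9][0-9]* and [0-9] select the same lines.
digits_star=$(grep -n '[0-9][0-9]*' "$sample")
digits_plain=$(grep -n '[0-9]' "$sample")

printf '%s\n---\n%s\n---\n%s\n---\n%s\n---\n%s\n' \
    "$four_chars" "$g_to_g" "$one_plus_o" "$two_plus_o" "$digits_star"
rm -rf "$tmpdir"
```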
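Find strings with two o's in a row
grep -n 'o\{2\}' test.txt
# same as the following
grep 'ooo*' test.txt
Find g followed by 2 to 5 o's and then another g
grep -n 'go\{2,5\}g' test.txt
g followed by two or more o's and then g
grep 'go\{2,\}g' test.txt
# same as
grep 'gooo*g' test.txt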
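The interval patterns can be compared side by side on an invented file where the number of o's varies (the contents are assumptions for illustration). The sketch also shows the extended form, where grep -E takes the braces without backslashes.

```shell
# Hypothetical sample: g, then 1, 2, 5 and 6 o's, then g; plus a line with a space.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog' 'go og' > "$sample"

# Basic regex: braces are escaped. 2 to 5 o's between the two g's.
two_to_five=$(grep -n 'go\{2,5\}g' "$sample")

# Extended regex: the same interval without backslashes.
two_to_five_ere=$(grep -En 'go{2,5}g' "$sample")

# Open-ended interval: two or more o's ...
at_least_two=$(grep -n 'go\{2,\}g' "$sample")

# ... which selects the same lines as the * form.
same_with_star=$(grep -n 'gooo*g' "$sample")

printf '%s\n---\n%s\n' "$two_to_five" "$at_least_two"
rm -rf "$tmpdir"
```

The line with a space inside (go og) never matches, which is why a stray space in the pattern would make the search come back empty.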