2021-06-16 Regular expressions with grep

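  • A regular expression is a way of processing strings one line at a time: with the help of a few special symbols, it lets you search for or replace specific strings.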

grep

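  • grep compares string data one whole line at a time and prints the lines that match.
 grep [-acinv] 'search string' filename
-a treat a binary file as a text file when searching
-c count the lines that contain the 'search string'
-i ignore case
-n also print the line number
-v invert the match: print the lines that do NOT contain the search string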
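
As a quick way to see these options side by side, here is a minimal sketch. The three-line sample file is made up for this illustration and is not the grep_test.txt used in the rest of these notes.

```shell
# Hypothetical sample file, created only for this sketch.
sample=$(mktemp)
printf '%s\n' 'the first line' 'The second line' 'a third line' > "$sample"

count=$(grep -c 'the' "$sample")       # lines containing lowercase "the"
nocase=$(grep -ic 'the' "$sample")     # -i also counts the line with "The"
numbered=$(grep -n 'third' "$sample")  # -n prefixes the line number
inverted=$(grep -vc 'the' "$sample")   # -v selects lines that do NOT match

echo "count=$count nocase=$nocase inverted=$inverted"
echo "$numbered"
rm -f "$sample"
```

Note that -c counts matching lines, not individual occurrences: a line with "the" in it twice still counts once.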

1. Basic usage

(base) [root@localhost Orthogroups]# grep '10' OG.txt 
OG0010465
OG0010685
OG0010467
OG0010469
OG0010470
OG0010520
OG0001017
OG0001050
OG0001062
OG0003106
OG0010501
OG0010483
OG0015110
OG0000810
OG0001100
OG0010686
OG0000510
OG0010471
OG0001052
OG0016106
OG0001060
OG0001012
OG0001013
OG0001014
100DG.pep<16>:  
(base) [root@localhost test]# grep -n 'the' grep_test.txt 
8:I can't finish the test.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.

2. The -i option: case-insensitive matching

(base) [root@localhost test]# grep -in 'the' grep_test.txt 
8:I can't finish the test.
9:Oh! The soup taste good.
12:the symbol '*' is represented as start.
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.

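3. Using [] to search for a set of characters

Search for test or taste

3.1 A [] stands for exactly one character, however many are listed inside it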

grep -n 't[ea]st'  grep_test.txt
8:I can't finish the test.
9:Oh! The soup taste good.
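
A small sketch of the same pattern on a made-up word list (the words are assumptions chosen for this illustration), showing that the bracket consumes one character only:

```shell
# Only "test" and "taste" contain t, then e or a, then st.
# "toast" fails because the letter right after its first t is o.
matches=$(printf '%s\n' test taste toast tst | grep 't[ea]st')
echo "$matches"
```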

3.2 [^] negated sets

3.2.1 Search for oo

grep -n 'oo' grep_test.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!

3.2.2 Excluding oo preceded by g

grep -n '[^g]oo' grep_test.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
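
Lines 18 and 19 still appear even though both contain "goo". A sketch of why, using the -o option (available in GNU and BSD grep, not part of the original notes) to print only the text that matched:

```shell
# In line 18 the match comes from "tools", not from "google".
matched=$(echo 'google is the best tools for search keyword.' | grep -o '[^g]oo')
echo "$matched"

# In line 19 the character before an "oo" can itself be an o.
first=$(echo 'goooooogle yes!' | grep -o '[^g]oo' | head -n 1)
echo "$first"
```

So [^g]oo rejects a particular match position, not the whole line: the line is printed as soon as any oo on it has a non-g character in front.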

3.2.3 Excluding oo preceded by a lowercase letter

grep -n '[^a-z]oo' grep_test.txt
3:Football game is not use feet only.

4. Searching for digits

4.1 Lines that contain a digit

grep -n '[0-9]' grep_test.txt
5:However, this dress is about $ 3183 dollars.
15:You are the best is mean you are the no. 1.

4.2 Lines that contain a letter (upper or lower case) or a digit

grep -n '[a-zA-Z0-9]' grep_test.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
8:I can't finish the test.
9:Oh! The soup taste good.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh!  My god!
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird

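5. Line anchors: ^ (start of line) and $ (end of line)

- Note the difference between ^ inside [] and ^ outside []: inside brackets it negates the set; outside brackets it anchors the match to the start of the line.

# "the" at the start of a line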
grep -n '^the' grep_test.txt
12:the symbol '*' is represented as start.
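
The two meanings of ^ can be compared directly. This is a minimal sketch on four made-up lines; LC_ALL=C is set as a precaution, because in some locales a range such as a-z may also cover uppercase letters.

```shell
lines=$(printf '%s\n' 'the start' 'The start' 'at the end' '#comment')

# ^ outside []: anchor, "the" must be the first thing on the line.
at_start=$(printf '%s\n' "$lines" | LC_ALL=C grep '^the')

# ^ inside []: negation, the first character must NOT be a lowercase letter.
not_lower=$(printf '%s\n' "$lines" | LC_ALL=C grep '^[^a-z]')

echo "$at_start"
echo "$not_lower"
```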

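5.1 Lines ending with .

  • Escape character (\): a bare . matches any single character, so a literal period is written \.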

(base) [root@localhost test]# grep -n '\.$' grep_test.txt 
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
8:I can't finish the test.
9:Oh! The soup taste good.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.

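cat -A grep_test.txt (note to self: look into this later. GNU cat's -A option shows non-printing characters, marking each line end with $ and each tab with ^I.)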
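
One situation where cat -A helps is a line that looks as if it ends in a period but does not match '\.$'. The sketch below assumes GNU cat (other cat implementations may not accept -A) and uses two made-up lines, the first with a DOS-style CRLF ending:

```shell
# GNU cat prints the carriage return as ^M just before the end-of-line $.
printf 'dos line.\r\nunix line.\n' | cat -A || echo 'cat -A is not supported here'

# For grep the last character of the first line is the carriage return,
# not the period, so only the second line matches.
dotted=$(printf 'dos line.\r\nunix line.\n' | grep '\.$')
echo "$dotted"
```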

5.2 Print blank lines

grep -n '^$' grep_test.txt

5.3 Skip # comment lines and blank lines

grep -v '^$' grep_test.txt | grep -v '^#'

6. egrep: extended regular expressions

# Equivalent to the two-grep pipeline in 5.3 (egrep is the same as grep -E)
egrep -v '^$|^#' grep_test.txt
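
A sketch that checks the two forms against each other on a made-up config-style input. It uses grep -E rather than egrep, since the two are equivalent and newer GNU grep releases print a warning when egrep is invoked.

```shell
conf=$(printf '%s\n' '# a comment' '' 'key=value' '' 'other=1')

# Basic regex: one grep per condition, chained with a pipe.
two_step=$(printf '%s\n' "$conf" | grep -v '^$' | grep -v '^#')

# Extended regex: | joins both conditions in a single pattern.
one_step=$(printf '%s\n' "$conf" | grep -Ev '^$|^#')

echo "$one_step"
```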

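6.1 Special symbols in extended regular expressions

+ one or more of the preceding character
? zero or one of the preceding character
|  or (alternation)
() groups a string: 'g(la|oo)d' matches glad or good
() can also mark a group that repeats several times
echo 'AxyzxyzxyzxyzC' | egrep 'A(xyz)+C'  finds a string that starts with A, ends with C, and has one or more xyz in between
. matches exactly one character (any character)
* is not a wildcard in regular expressions; it repeats the preceding character zero or more times, so it is always combined with that character
{} limits how many times the preceding character repeats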
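
The symbols above can be tried out on small inputs. A minimal sketch with grep -E (equivalent to egrep); apart from the AxyzxyzxyzxyzC string and the glad/good pattern from the list, the words are made up for this illustration:

```shell
# + : the group (xyz) must occur one or more times between A and C.
plus=$(echo 'AxyzxyzxyzxyzC' | grep -E 'A(xyz)+C')

# () with | : either "la" or "oo" between g and d.
group=$(printf '%s\n' glad good gold | grep -E 'g(la|oo)d')

# ? : the u is optional, but at most one u is allowed.
optional=$(printf '%s\n' color colour colouur | grep -E 'colou?r')

echo "$plus"
echo "$group"
echo "$optional"
```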

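! is not a special character in regular expressions; for negation, use [^].

Again: * is not a wildcard; it means zero or more of the preceding character.

In basic regular expressions (plain grep), { and } must be escaped as \{ and \} to act as a repeat count; egrep takes them unescaped.

\ is the escape character

# Find lines that contain a single quote (')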
grep -n \' test.txt 

6.2 Lines that contain ! or >

grep -n '[!>]' grep_test.txt
Exclude blank lines and lines that start with #
grep -v '^$' test.txt | grep -v '^#'

Example 1
Strings that start with g, end with d, and are four characters long

grep  'g..d' test.txt

Strings with two or more o characters

grep 'ooo*' test.txt

By the same logic, one or more o characters would be

grep 'oo*' test.txt
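
Because * allows zero repetitions, the number of o's written before the final o* sets the minimum. A sketch on three made-up words that counts the matching lines for each pattern:

```shell
words=$(printf '%s\n' gd god good)

any=$(printf '%s\n' "$words" | grep -c 'o*')        # zero or more o: every line matches
one_plus=$(printf '%s\n' "$words" | grep -c 'oo*')  # at least one o
two_plus=$(printf '%s\n' "$words" | grep -c 'ooo*') # at least two o in a row

echo "any=$any one_plus=$one_plus two_plus=$two_plus"
```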

Strings that start with g and end with g

grep 'g.*g' test.txt
# Note: * means zero or more of the preceding character, so .* matches any run of characters, including none
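
To see what g.*g actually covers, this sketch uses -o (GNU and BSD grep) to print only the matched text; both input strings are made up for the illustration:

```shell
# .* takes as much as it can: the match runs from the first g to the last g.
span=$(echo 'google is good' | grep -o 'g.*g')
echo "$span"

# .* may also match nothing, so two adjacent g's are enough.
adjacent=$(echo 'egg' | grep -c 'g.*g')
echo "$adjacent"
```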

Lines that contain any digit

grep '[0-9][0-9]*' test.txt
# Of course, the following achieves the same result
grep -n '[0-9]' test.txt

Strings with two o characters

grep -n 'o\{2\}' test.txt
# Selects the same lines as the following
grep 'ooo*' test.txt

g followed by 2 to 5 o's, then another g

grep -n 'go\{2,5\}g'  test.txt

g followed by two or more o's, then g

grep 'go\{2,\}g' test.txt
# Same as
grep 'gooo*g' test.txt
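
The interval forms can be compared on a made-up word list with 1, 2, 5 and 6 o's between the two g's. The last command is the extended-regex spelling of the first, with the braces left unescaped:

```shell
# gog = 1 o, goog = 2 o, gooooog = 5 o, goooooog = 6 o
words=$(printf '%s\n' gog goog gooooog goooooog)

bounded=$(printf '%s\n' "$words" | grep 'go\{2,5\}g')     # 2 to 5 o's
open_ended=$(printf '%s\n' "$words" | grep 'go\{2,\}g')   # 2 or more o's
extended=$(printf '%s\n' "$words" | grep -E 'go{2,5}g')   # same as bounded

echo "$bounded"
echo "$open_ended"
```

The word with six o's is rejected by the bounded pattern because the g anchors both ends: after at most five o's the next character has to be g.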
[Figure: summary of regular expression symbols]