sed is short for stream editor: it edits text programmatically rather than interactively.
Substituting with the s command
The examples below use this text:
$ cat pets.txt
This is my cat
my cat's name is betty
This is my dog
my dog's name is frank
This is my fish
my fish's name is george
This is my goat
my goat's name is adam
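To replace the string my with Hao Chen's, use the command below. It reads naturally: s is the substitute command, /my/ is the pattern to match, /Hao Chen's/ is the replacement, and the trailing /g replaces every match on a line rather than only the first: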
$ sed "s/my/Hao Chen's/g" pets.txt
This is Hao Chen's cat
Hao Chen's cat's name is betty
This is Hao Chen's dog
Hao Chen's dog's name is frank
This is Hao Chen's fish
Hao Chen's fish's name is george
This is Hao Chen's goat
Hao Chen's goat's name is adam
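Note: if you wrap the sed script in single quotes, you cannot escape a single quote inside it with \'. Use double quotes instead; inside double quotes, a literal double quote can be escaped as \".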
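The quoting rule can be sketched as follows, with two common ways to get an apostrophe into the replacement. The variable name owner and the sample line are assumptions made for illustration only.

```shell
# Input line used for the demonstration.
line='This is my cat'

# Option 1: double-quote the script so the apostrophe needs no escaping.
# The shell also expands variables inside double quotes.
owner="Hao Chen's"   # hypothetical variable, for illustration only
out1=$(printf '%s\n' "$line" | sed "s/my/$owner/g")

# Option 2: stay in single quotes and splice in an apostrophe with '\''
# (close the quote, add an escaped ', reopen the quote).
out2=$(printf '%s\n' "$line" | sed 's/my/Hao Chen'\''s/g')

printf '%s\n%s\n' "$out1" "$out2"
```

Both forms should print This is Hao Chen's cat. Option 1 is easier to read; option 2 is useful when the script also contains characters such as $ that the shell would otherwise expand.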
Note also: the sed command above does not change the file; it only prints the processed text. To keep the result, redirect the output to another file:
$ sed "s/my/Hao Chen's/g" pets.txt > hao_pets.txt
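Or use the -i option to edit the file in place (this exact form works with GNU sed; BSD and macOS sed expect a backup suffix argument after -i):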
$ sed -i "s/my/Hao Chen's/g" pets.txt
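Since -i overwrites the original, a cautious variant keeps a backup. The sketch below assumes the -i.bak spelling (suffix attached to the option), which both GNU and BSD sed accept, and it works in a scratch directory so that no real file is touched. The directory and sample contents are illustrative assumptions.

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf 'This is my cat\nThis is my dog\n' > "$workdir/pets.txt"

# -i.bak edits pets.txt in place and saves the original as pets.txt.bak.
sed -i.bak "s/my/Hao Chen's/g" "$workdir/pets.txt"

edited=$(cat "$workdir/pets.txt")
backup=$(cat "$workdir/pets.txt.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

# Clean up the scratch directory.
rm -rf "$workdir"
```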
Add something at the beginning of every line:
$ sed 's/^/#/g' pets.txt
#This is my cat
#my cat's name is betty
#This is my dog
#my dog's name is frank
#This is my fish
#my fish's name is george
#This is my goat
#my goat's name is adam
Add something at the end of every line:
$ sed 's/$/ --- /g' pets.txt
This is my cat ---
my cat's name is betty ---
This is my dog ---
my dog's name is frank ---
This is my fish ---
my fish's name is george ---
This is my goat ---
my goat's name is adam ---
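Because ^ and $ each match exactly once per line, the g flag makes no difference in these two commands. A minimal sketch that applies both edits in a single call, passing each command with its own -e option (the sample input is an assumption):

```shell
# ^ and $ match once per line, so the g flag is not needed here.
out=$(printf 'This is my cat\nThis is my dog\n' | sed -e 's/^/#/' -e 's/$/ ---/')
printf '%s\n' "$out"
```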
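While we are here, a few regular-expression basics:
^ matches the beginning of a line. For example, /^#/ matches lines that start with #.
$ matches the end of a line. For example, /}$/ matches lines that end with }.
\< matches the beginning of a word (a GNU extension). For example, \<abc matches words that start with abc.
\> matches the end of a word (a GNU extension). For example, abc\> matches words that end with abc.
. matches any single character.
* matches zero or more occurrences of the preceding character.
[] is a character set. For example, [abc] matches a, b, or c, and [a-zA-Z] matches any letter, upper or lower case. A leading ^ negates the set: [^a] matches any character other than a.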
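Each of these building blocks can be tried on a one-line input. The sketch below sticks to the portable ones (^, $, ., *, and bracket sets) and uses sed -n '/re/p', which prints only the lines matching re; that form is not covered above and is used here only for the demonstration. All sample strings are assumptions.

```shell
# ^ anchors at line start: print only lines that begin with #.
starts=$(printf '#note\ncode\n' | sed -n '/^#/p')

# $ anchors at line end: print only lines that end with }.
ends=$(printf 'if (x) {\n}\n' | sed -n '/}$/p')

# . matches any single character.
dot=$(echo 'cat cot cut' | sed 's/c.t/X/g')

# * repeats the preceding character zero or more times.
star=$(echo 'ac abc abbc' | sed 's/ab*c/X/g')

# [^a-z] matches any character that is not a lowercase letter.
negated=$(echo 'a1b2c3' | sed 's/[^a-z]//g')

printf '%s\n' "$starts" "$ends" "$dot" "$star" "$negated"
```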
Regular expressions are powerful. For example, suppose we want to strip the tags from this HTML:
<b>This</b> is what <span style="text-decoration: underline;">I</span> meant. Understand?
Here is a first attempt with sed:
$ sed 's/<.*>//g' html.txt
 meant. Understand?
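That attempt goes wrong: .* is greedy, so <.*> matches from the first < to the last > on the line and deletes everything in between. The fix is shown below, where [^>]* matches zero or more characters other than >, so each match stops at the nearest >: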
$ sed 's/<[^>]*>//g' html.txt
This is what I meant. Understand?
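The difference between the two patterns is easiest to see on a short line that ends with a tag. This is a small sketch with an assumed sample string, not the html.txt used above:

```shell
html='<i>a</i> and <b>b</b>'

# Greedy: <.*> runs from the first < to the last >, here the whole line.
greedy=$(printf '%s\n' "$html" | sed 's/<.*>//g')

# Negated class: each match stops at the nearest >, so only tags go.
tags_only=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

printf 'greedy:    [%s]\ntags only: [%s]\n' "$greedy" "$tags_only"
```

The greedy version leaves an empty line, while the negated-class version leaves a and b.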
Next, restricting the substitution to specific lines. The command below changes only line 3:
$ sed "3s/my/your/g" pets.txt
This is my cat
my cat's name is betty
This is your dog
my dog's name is frank
This is my fish
my fish's name is george
This is my goat
my goat's name is adam
The command below substitutes only on lines 3 through 6:
$ sed "3,6s/my/your/g" pets.txt
This is my cat
my cat's name is betty
This is your dog
your dog's name is frank
This is your fish
your fish's name is george
This is my goat
my goat's name is adam
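Line addresses can also use $, which stands for the last line of the input. The sketch below, on an assumed three-line input, shows a single-line address, the last-line address, and a range ending at the last line. Note the single quotes: inside double quotes the shell would try to expand $s as a variable.

```shell
input=$(printf 'This is my cat\nThis is my dog\nThis is my fish\n')

# 1s limits the substitution to the first line.
first=$(printf '%s\n' "$input" | sed '1s/my/your/')

# $ as an address means the last line; single quotes keep the shell
# from treating $s as a variable.
last=$(printf '%s\n' "$input" | sed '$s/my/your/')

# 2,$ covers line 2 through the end of the input.
rest=$(printf '%s\n' "$input" | sed '2,$s/my/your/')

printf '%s\n---\n%s\n---\n%s\n' "$first" "$last" "$rest"
```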
Replace only the first s on each line:
$ cat my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
This is my goat, my goat's name is adam
$ sed 's/s/S/1' my.txt
ThiS is my cat, my cat's name is betty
ThiS is my dog, my dog's name is frank
ThiS is my fish, my fish's name is george
ThiS is my goat, my goat's name is adam
Replace only the second s on each line:
$ sed 's/s/S/2' my.txt
This iS my cat, my cat's name is betty
This iS my dog, my dog's name is frank
This iS my fish, my fish's name is george
This iS my goat, my goat's name is adam
Replace the third and all later occurrences of s on each line (combining a number with g is a GNU sed extension):
$ sed 's/s/S/3g' my.txt
This is my cat, my cat'S name iS betty
This is my dog, my dog'S name iS frank
This is my fiSh, my fiSh'S name iS george
This is my goat, my goat'S name iS adam
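The occurrence flags are easier to compare on a short string. This sketch uses only the portable forms (no flag, a number, or g) on an assumed input:

```shell
# With no flag, only the first match is replaced (same as the flag 1).
first=$(echo 'a-b-c-d' | sed 's/-/+/')

# A numeric flag N replaces only the Nth match on each line.
second=$(echo 'a-b-c-d' | sed 's/-/+/2')

# The g flag replaces every match on each line.
all=$(echo 'a-b-c-d' | sed 's/-/+/g')

printf '%s\n%s\n%s\n' "$first" "$second" "$all"
```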
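Multiple patterns
To apply several substitutions in one pass, see the example below. The first command replaces my with your on lines 1 through 3; the second replaces This with That from line 3 to the last line: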
$ sed '1,3s/my/your/g; 3,$s/This/That/g' my.txt
This is your cat, your cat's name is betty
This is your dog, your dog's name is frank
That is your fish, your fish's name is george
That is my goat, my goat's name is adam
The command above is equivalent to the following, which uses sed's -e command-line option:
$ sed -e '1,3s/my/your/g' -e '3,$s/This/That/g' my.txt
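A quick way to convince yourself that these spellings are interchangeable is to run them on the same input and compare the results. The sketch below does that on an assumed three-line input and adds a third spelling, one command per line inside a single script:

```shell
input=$(printf 'This is my cat\nThis is my dog\nThis is my fish\n')

# Commands separated by ; inside one script.
a=$(printf '%s\n' "$input" | sed '1,2s/my/your/; 2,$s/This/That/')

# The same commands passed as separate -e options.
b=$(printf '%s\n' "$input" | sed -e '1,2s/my/your/' -e '2,$s/This/That/')

# The same commands on separate lines of one script.
c=$(printf '%s\n' "$input" | sed '1,2s/my/your/
2,$s/This/That/')

printf '%s\n' "$a"
```

Commands run in order on each line, so line 2 here is touched by both substitutions.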
The & character in the replacement stands for the matched text, so you can add something on either side of each match:
$ sed 's/my/[&]/g' my.txt
This is [my] cat, [my] cat's name is betty
This is [my] dog, [my] dog's name is frank
This is [my] fish, [my] fish's name is george
This is [my] goat, [my] goat's name is adam
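& is most useful when the pattern is not a fixed string, since the matched text then varies from line to line. A small sketch with assumed inputs, which also shows that \& gives a literal ampersand:

```shell
# & expands to whatever the pattern matched, here a run of digits.
wrapped=$(echo 'listen on port 8080' | sed 's/[0-9][0-9]*/<&>/')

# \& in the replacement is a literal ampersand, not the match.
literal=$(echo 'cats and dogs' | sed 's/and/\&/')

printf '%s\n%s\n' "$wrapped" "$literal"
```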
Parenthesized groups
Text matched by a regular expression inside escaped parentheses \( \) can be reused in the replacement as \1, \2, and so on:
$ sed 's/This is my \([^,]*\),.*is \(.*\)/\1:\2/g' my.txt
cat:betty
dog:frank
fish:george
goat:adam
The regular expression in this example is a little dense. Broken down (with the escaping backslashes removed):
Regex: This is my ([^,]*),.*is (.*)
Match: This is my (cat),……….is (betty)
So \1 is cat and \2 is betty.
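Groups can also be emitted in a different order than they were matched. The sketch below swaps two comma-separated fields, first with the escaped \( \) syntax used above and then with -E, which switches sed to extended regular expressions where plain ( ) form groups. It assumes your sed accepts -E (current GNU and BSD versions do); the input string is made up for the example.

```shell
# Basic regular expressions: groups are written \( \).
bre=$(echo 'betty,cat' | sed 's/\([^,]*\),\(.*\)/\2:\1/')

# Extended regular expressions (-E): plain ( ) form groups.
ere=$(echo 'betty,cat' | sed -E 's/([^,]*),(.*)/\2:\1/')

printf '%s\n%s\n' "$bre" "$ere"
```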