Using shell to find the top 10 most frequent URLs
http://blog.csdn.net/guaguastd/article/details/8332757
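#!/bin/sh
# Count how often each URL occurs in a file and save the 10 most frequent ones.
foo()
{
    if [ $# -ne 1 ]; then
        echo "Usage: $0 filename" >&2
        exit 1
    fi
    # Extract the URLs (the dot before the top-level domain is escaped so it only
    # matches a literal "."), tally them in an awk array, sort by count in
    # descending order, and keep the first 10 lines. No header line is printed,
    # because sort would treat it as a data row with count 0.
    grep -Eo 'http://[a-zA-Z0-9.]+\.[a-zA-Z]{2,3}' "$1" |
        awk '{ count[$0]++ } END { for (ind in count) printf("%-30s %d\n", ind, count[ind]) }' |
        sort -nrk 2 | head -n 10 > websorted2.txt
}
foo website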
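The same ranking can also be sketched with sort and uniq -c in place of the awk array. This is an alternative for comparison, not the method used above: the function name top_urls and its optional second argument are hypothetical, and the pattern assumes the dot before the top-level domain should be a literal ".".

```shell
#!/bin/sh
# top_urls FILE [N]: print the N most frequent http:// URLs in FILE (default 10).
# Hypothetical helper for illustration only.
top_urls()
{
    file=$1
    n=${2:-10}
    # grep -o prints every match on its own line.
    # sort groups identical URLs so that uniq -c can count each group.
    # The final sort puts the highest count first and orders ties by URL.
    grep -Eo 'http://[a-zA-Z0-9.]+\.[a-zA-Z]{2,3}' "$file" |
        sort |
        uniq -c |
        sort -k1,1nr -k2 |
        head -n "$n"
}
```

Called as top_urls website, it prints the count in the first column and the URL in the second, which is the reverse of the column order in websorted2.txt.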
Example:
Contents of the file website:
http://www.google.com
http://www.baidu.com
http://www.sina.com
http://www.bjtu.edu.cn
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.bjtu.edu.cn
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.qq.com
http://www.hao123.com
http://www.163.com
http://youku.com
http://taobao/com
http://www.bjtu.edu.cn
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.qq.com
http://www.hao123.com
http://www.163.com
http://youku.com
http://taobao/com
Contents of the generated file websorted2.txt (the result):
http://www.yahoo.com           5
http://www.sohu.com            5
http://www.csdn.com            5
http://www.codeproject.com     5
http://mail.163.com            5
http://www.bjtu.edu.cn         3
http://youku.com               2
http://www.qq.com              2
http://www.hao123.com          2
http://www.163.com             2