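This post only covers capturing HTTP traffic; if you need to capture HTTPS, it will not help you.
Recently I had to grab some image resources for work, which meant using Fiddler and Wireshark. This post summarizes both packet-capture tools to help you get started with them more quickly.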
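Fiddler
Fiddler makes capturing HTTP traffic easy, but it requires an HTTP proxy to be configured. Setting the proxy in a browser or on a phone is simple, so I will skip it. Some devices may need adb to set the proxy.
Main adb commands
Set the proxy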
adb shell settings delete global http_proxy
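Note: run the command above first to check that the delete command exists on the device; otherwise the proxy may be hard to clear later.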
adb shell settings put global http_proxy 192.168.1.104:8888
Clear the proxy
adb shell settings delete global http_proxy
adb shell settings delete global global_http_proxy_host
adb shell settings delete global global_http_proxy_port
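If you switch the proxy on and off a lot, the commands above could be wrapped in a small Java helper. This is only a sketch under some assumptions: the class name AdbProxy and its methods are hypothetical, adb is assumed to be on the PATH with exactly one device connected, and it issues nothing beyond the settings put/delete commands listed above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AdbProxy {
    // Builds: adb shell settings put global http_proxy <host>:<port>
    public static List<String> setProxyCommand(String host, int port) {
        return Arrays.asList("adb", "shell", "settings", "put", "global",
                "http_proxy", host + ":" + port);
    }

    // Builds the three delete commands used above to clear the proxy
    public static List<List<String>> clearProxyCommands() {
        String[] keys = {"http_proxy", "global_http_proxy_host", "global_http_proxy_port"};
        List<List<String>> commands = new ArrayList<>();
        for (String key : keys) {
            commands.add(Arrays.asList("adb", "shell", "settings", "delete", "global", key));
        }
        return commands;
    }

    // Runs one command, forwarding its output to this console; returns the exit code
    public static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 3 && args[0].equals("set")) {
            // e.g. AdbProxy set 192.168.1.104 8888
            run(setProxyCommand(args[1], Integer.parseInt(args[2])));
        } else if (args.length == 1 && args[0].equals("clear")) {
            for (List<String> command : clearProxyCommands()) {
                run(command);
            }
        } else {
            System.out.println("usage: AdbProxy set <host> <port> | AdbProxy clear");
        }
    }
}
```

Building the commands as argument lists rather than one string avoids any quoting problems when they are handed to ProcessBuilder.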
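The capture window looks like the following.
Use Fiddler's filter feature to show only the image URLs.
Then copy all the URLs into a txt file.
Next, use Eclipse to write Java code that downloads the image behind each URL to the local disk.
The code below can serve as a reference.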
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Random;

public class Funs {

    /**
     * Downloads one file over HTTP and saves it under a generated name.
     *
     * @param refFileURL  URL of the file to download
     * @param refSavePath directory to save the file in, e.g. e:\path
     * @param refFileType file extension, e.g. gif, jpg, png, htm, html, ppt, doc
     * @return the generated file name (without the path), or an empty string on failure
     */
    public static String downloadNetFile(String refFileURL, String refSavePath, String refFileType) {
        if (refFileURL == null || refFileURL.trim().isEmpty()
                || refSavePath == null || refSavePath.trim().isEmpty()
                || refFileType == null || refFileType.trim().isEmpty()) {
            return "";
        }
        String fileType = refFileType.trim();
        if (!fileType.startsWith(".")) {
            fileType = "." + fileType;
        }
        // Generated name: down_<timestamp>_<random number><extension>
        String fileName = "down_" + new SimpleDateFormat("yyyyMMdd_HHmmssSSS").format(new Date())
                + "_" + new Random().nextInt(1000) + fileType;
        File saveDownFile = new File(refSavePath.trim(), fileName);
        HttpURLConnection connection = null;
        try {
            // Open a connection to the remote resource
            connection = (HttpURLConnection) new URL(refFileURL.trim()).openConnection();
            // BufferedInputStream and BufferedOutputStream would work here as well.
            // try-with-resources closes both streams even if the copy fails midway.
            try (DataInputStream ins = new DataInputStream(connection.getInputStream());
                 // append = false: overwrite an existing file with the same name
                 DataOutputStream out = new DataOutputStream(new FileOutputStream(saveDownFile, false))) {
                byte[] buffer = new byte[4096];
                int count;
                while ((count = ins.read(buffer)) > 0) {
                    out.write(buffer, 0, count);
                }
            }
            return fileName;
        } catch (Exception e) {
            // Report the failure instead of swallowing it silently
            e.printStackTrace();
            return "";
        } finally {
            if (connection != null) {
                connection.disconnect(); // release the HTTP connection
            }
        }
    }

    public static void main(String[] args) {
        // String refFileURL = "http://puui.qpic.cn/vcover_vt_pic/0/mzc00200y8o6ry91588759762579/260";
        String refSavePath = "E:\\save_path"; // placeholder: directory to save the images in
        String refFileType = "jpg";
        // downloadNetFile(refFileURL, refSavePath, refFileType);
        // placeholder: path of the txt file WITHOUT the .txt extension (readDocUrl appends it)
        List<String> list = readDocUrl("E:\\txt_file_name");
        System.out.println(list.size());
        for (String s : list) {
            System.out.println(s);
            downloadNetFile(s, refSavePath, refFileType);
        }
    }

    /**
     * Reads the URLs from docPath + ".txt". A line starting with "http" begins a new URL;
     * any other line is treated as the continuation of the previous URL.
     */
    public static List<String> readDocUrl(String docPath) {
        List<String> list = new ArrayList<>();
        // Read the file line by line; try-with-resources closes the reader
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(docPath + ".txt")))) {
            StringBuilder sb = new StringBuilder();
            String s;
            // readLine() throws IOException if an I/O error occurs
            while ((s = br.readLine()) != null) {
                if (s.startsWith("http")) { // also matches "https"
                    String result = sb.toString().trim();
                    if (!result.isEmpty()) list.add(result);
                    sb.setLength(0);
                }
                sb.append(s);
            }
            String result = sb.toString().trim();
            if (!result.isEmpty()) list.add(result);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return list;
    }
}
Set the path of the txt file that holds the URLs and the directory for saving the images, then run the program.
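One limitation worth noting: every file is saved with the single extension passed as refFileType. If the captured URLs mix formats, one possible refinement is to derive the extension from the response's Content-Type header, which HttpURLConnection exposes through getContentType(). The sketch below is my own assumption and not part of the program above; the class ExtensionGuess, the method extensionFor, and the mapping table are hypothetical and cover only a few common image types.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ExtensionGuess {
    // Small, deliberately incomplete table of image MIME types
    private static final Map<String, String> EXTENSIONS = new HashMap<>();
    static {
        EXTENSIONS.put("image/jpeg", "jpg");
        EXTENSIONS.put("image/png", "png");
        EXTENSIONS.put("image/gif", "gif");
        EXTENSIONS.put("image/webp", "webp");
        EXTENSIONS.put("image/bmp", "bmp");
    }

    // contentType: e.g. the value of connection.getContentType(); may be null
    // or carry parameters after a semicolon. Unknown types yield the fallback.
    public static String extensionFor(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        int semicolon = contentType.indexOf(';');
        String mime = (semicolon >= 0 ? contentType.substring(0, semicolon) : contentType)
                .trim().toLowerCase(Locale.ROOT);
        return EXTENSIONS.getOrDefault(mime, fallback);
    }

    public static void main(String[] args) {
        System.out.println(extensionFor("image/png", "jpg"));
        System.out.println(extensionFor("text/html; charset=utf-8", "jpg"));
    }
}
```

In a download routine, the result of extensionFor(connection.getContentType(), "jpg") could stand in for the fixed extension, provided the file name is built after the connection has been opened.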
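Wireshark
Next, a short look at capturing with Wireshark. There is plenty of reference material online, so this post only covers a few points to watch out for.
1 For a remote device, the most convenient setup is to turn on a Wi-Fi hotspot on a laptop and connect the device you want to capture to it; in Wireshark, select the Local Area Connection (本地连接) network interface.
2 The first time you open Wireshark, it may not find any network interface
See the articles "解决wireshark检测不到网卡的问题" (fixing Wireshark not detecting any network interface) and "wireshark安装 net start npf 服务器名无效" (the "net start npf" error seen after installing Wireshark).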
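3 Filtering HTTP requests in Wireshark
Wireshark captures traffic for many protocols, so filter the display down to HTTP, for example by typing http into the display filter bar.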
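4 How to display only the full request URL, as in the screenshot above
By default the Wireshark capture window shows several columns, but we may only care about the URL.
Following the order shown in the figure, apply Full Request URI as a column, then right-click at position ③ and uncheck the columns you do not need, keeping only Full Request URI. The final result is as shown in the figure.
Finally, select the Full Request URI entries with Ctrl+A, copy them with Ctrl+C, and paste them into a txt file. After removing the empty lines, you can run the download program to save the images locally :).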
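The manual clean-up of the pasted column can also be scripted. Here is a minimal sketch, assuming the pasted text holds one complete URL per line, mixed with blank lines and possibly a column header, and that the file is saved as UTF-8. The class UrlListCleaner and the method clean are hypothetical names. It trims each line, keeps only the lines that start with http, and drops duplicates while preserving their order.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UrlListCleaner {
    // Keeps trimmed lines that start with "http", removing blanks and duplicates
    public static List<String> clean(List<String> lines) {
        Set<String> unique = new LinkedHashSet<>(); // preserves first-seen order
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("http")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: UrlListCleaner <pasted.txt> <cleaned.txt>");
            return;
        }
        List<String> raw = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), clean(raw), StandardCharsets.UTF_8);
    }
}
```

Removing duplicates is optional, but the same image is often requested more than once during a capture, so it avoids downloading it repeatedly.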
That is all for this post.