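The client in Go's standard net/http library can be configured with various kinds of proxy: HTTP, HTTPS, SOCKS5 and so on. fasthttp, however, only supports SOCKS5 proxies (at the time of writing), configured by supplying a fasthttp DialFunc: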
c := &fasthttp.Client{
	Dial: fasthttpproxy.FasthttpSocksDialer("localhost:9050"),
}
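The problem I ran into in my project was that ops only provided an HTTP proxy built with squid, so I wanted to define a DialFunc for HTTP proxies instead. Searching the issues in the fasthttp GitHub repository, I found one in which the author provides such a DialFunc: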
https://github.com/valyala/fasthttp/issues/363#issuecomment-417868528
// Requires imports: bufio, fmt, net, github.com/valyala/fasthttp.
func FasthttpHTTPDialer(proxyAddr string) fasthttp.DialFunc {
	return func(addr string) (net.Conn, error) {
		// Open a TCP connection to the proxy itself.
		conn, err := fasthttp.Dial(proxyAddr)
		if err != nil {
			return nil, err
		}
		// Ask the proxy to open a tunnel to the target address.
		req := "CONNECT " + addr + " HTTP/1.1\r\n"
		// req += "Proxy-Authorization: xxx\r\n"
		req += "\r\n"
		if _, err := conn.Write([]byte(req)); err != nil {
			conn.Close() // do not leak the connection when the write fails
			return nil, err
		}
		// Read the proxy's reply; the body is skipped.
		res := fasthttp.AcquireResponse()
		defer fasthttp.ReleaseResponse(res)
		res.SkipBody = true
		if err := res.Read(bufio.NewReader(conn)); err != nil {
			conn.Close()
			return nil, err
		}
		if res.Header.StatusCode() != 200 {
			conn.Close()
			return nil, fmt.Errorf("could not connect to proxy: status %d", res.Header.StatusCode())
		}
		return conn, nil
	}
}
c := &fasthttp.Client{
	Dial: FasthttpHTTPDialer("localhost:9050"),
}
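The commented-out Proxy-Authorization line in the dialer is left as a placeholder. As a rough sketch of how it might be filled in, assuming the proxy uses HTTP Basic authentication (the original does not say which scheme it uses), the request string could be built like this. The function name buildConnectRequest and the user/pass values are made up for illustration, and the Host header is an addition that the original dialer does not send.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// buildConnectRequest returns the raw CONNECT request for addr ("host:port").
// Hypothetical helper: if user is non-empty, a Proxy-Authorization header
// using the Basic scheme is added (an assumption about the proxy's setup).
func buildConnectRequest(addr, user, pass string) string {
	req := "CONNECT " + addr + " HTTP/1.1\r\n"
	req += "Host: " + addr + "\r\n"
	if user != "" {
		cred := base64.StdEncoding.EncodeToString([]byte(user + ":" + pass))
		req += "Proxy-Authorization: Basic " + cred + "\r\n"
	}
	// An empty line terminates the header block.
	return req + "\r\n"
}

func main() {
	// Placeholder credentials, for illustration only.
	fmt.Printf("%q\n", buildConnectRequest("www.google.com:443", "user", "pass"))
}
```

Inside the dialer, the resulting string would simply replace the hand-built req before conn.Write.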
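In testing, HTTPS sites were reachable through this dialer, but HTTP sites were not: the proxy connection failed. Before getting to the cause, here is how proxying differs between HTTP and HTTPS.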
HTTPS
xx@DESKTOP-TD3VVD0:~$ curl -x 127.0.0.1:1080 https://www.google.com -I -v
* Rebuilt URL to: https://www.google.com/
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 1080 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to www.google.com:443
> CONNECT www.google.com:443 HTTP/1.1
> Host: www.google.com:443
> User-Agent: curl/7.58.0
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 Connection established
HTTP/1.1 200 Connection established
<
> HEAD / HTTP/2
> Host: www.google.com
> User-Agent: curl/7.58.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200
HTTP/2 200
< date: Sat, 27 Apr 2019 12:06:57 GMT
date: Sat, 27 Apr 2019 12:06:57 GMT
< expires: -1
expires: -1
< cache-control: private, max-age=0
cache-control: private, max-age=0
< content-type: text/html; charset=ISO-8859-1
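When an HTTPS site is accessed through a proxy, the client first sends a CONNECT request asking the proxy to establish an HTTP tunnel to the target site; all subsequent traffic flows through that tunnel. Mapped onto the DialFunc above, the process is:
- The client opens a TCP connection to the proxy.
- Over that connection it sends a CONNECT request asking the proxy to set up an HTTP tunnel to the target site (google here). The proxy replies HTTP/1.1 200 Connection established; this differs from the reply to an ordinary HTTP request and is specific to the CONNECT method.
- Requests can then be sent over the same TCP connection.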
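The three steps above can be sketched with the standard library alone. This is only an illustration, not fasthttp's code: connectTunnel, fakeProxy and runDemo are hypothetical names, and the "proxy" is an in-process stand-in on a net.Pipe so the example runs without a real squid instance.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"net/http"
)

// connectTunnel performs the CONNECT handshake on a connection that is
// already open to the proxy (step 1 is the caller's dial).
func connectTunnel(conn net.Conn, target string) error {
	// Step 2a: ask the proxy to open a tunnel to target ("host:port").
	if _, err := fmt.Fprintf(conn, "CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n", target, target); err != nil {
		return err
	}
	// Step 2b: read the proxy's reply; only 200 means the tunnel is up.
	resp, err := http.ReadResponse(bufio.NewReader(conn), &http.Request{Method: http.MethodConnect})
	if err != nil {
		return err
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("proxy refused CONNECT %s: %s", target, resp.Status)
	}
	// Step 3: from here on, conn carries raw bytes to and from target.
	return nil
}

// fakeProxy stands in for a real proxy: it reads one request, reports what
// it saw on the channel, and answers as a proxy would after a successful CONNECT.
func fakeProxy(conn net.Conn, seen chan<- string) {
	defer conn.Close()
	req, err := http.ReadRequest(bufio.NewReader(conn))
	if err != nil {
		seen <- "error: " + err.Error()
		return
	}
	seen <- req.Method + " " + req.RequestURI
	fmt.Fprint(conn, "HTTP/1.1 200 Connection established\r\n\r\n")
}

// runDemo wires client and fake proxy together over an in-memory pipe and
// returns the request line details the proxy received.
func runDemo() (string, error) {
	client, server := net.Pipe()
	defer client.Close()
	seen := make(chan string, 1) // buffered so the proxy never blocks on it
	go fakeProxy(server, seen)
	if err := connectTunnel(client, "www.google.com:443"); err != nil {
		return "", err
	}
	return <-seen, nil
}

func main() {
	got, err := runDemo()
	if err != nil {
		fmt.Println("handshake failed:", err)
		return
	}
	fmt.Println("proxy saw:", got)
}
```

With a real proxy, the net.Pipe would be replaced by a net.Dial to the proxy address, and the returned connection would then be handed to TLS for an HTTPS target.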
HTTP
xx@DESKTOP-TD3VVD0:~$ curl -x 127.0.0.1:1080 http://www.baidu.com -v -I
* Rebuilt URL to: http://www.baidu.com/
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 1080 (#0)
> HEAD http://www.baidu.com/ HTTP/1.1
> Host: www.baidu.com
> User-Agent: curl/7.58.0
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: bfe/1.0.8.18
Server: bfe/1.0.8.18
< Date: Sat, 27 Apr 2019 12:15:06 GMT
Date: Sat, 27 Apr 2019 12:15:06 GMT
< Content-Type: text/html
Content-Type: text/html
< Content-Length: 277
Content-Length: 277
< Last-Modified: Mon, 13 Jun 2016 02:50:23 GMT
Last-Modified: Mon, 13 Jun 2016 02:50:23 GMT
< Connection: Close
Connection: Close
< ETag: "575e1f6f-115"
ETag: "575e1f6f-115"
< Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
< Pragma: no-cache
Pragma: no-cache
< Accept-Ranges: bytes
Accept-Ranges: bytes
< Proxy-Connection: keep-alive
Proxy-Connection: keep-alive
<
* Closing connection 0
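As the trace shows, no CONNECT request is sent for an HTTP site. Instead, the full URL of the target site is placed directly in the request line, where the path would normally go.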
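To make the difference concrete, net/http exposes both forms: Request.Write emits the request as it would go to the origin server, while Request.WriteProxy emits the absolute-URL form that an HTTP proxy expects. The sketch below only compares the two request lines; requestLines and firstLine are hypothetical helper names, and nothing is sent over the network.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

// firstLine returns everything before the first CRLF.
func firstLine(s string) string {
	return strings.SplitN(s, "\r\n", 2)[0]
}

// requestLines returns the request line as sent directly to the origin
// server and as sent to an HTTP proxy.
func requestLines(method, rawURL string) (direct, viaProxy string, err error) {
	req, err := http.NewRequest(method, rawURL, nil)
	if err != nil {
		return "", "", err
	}
	var d, p bytes.Buffer
	if err := req.Write(&d); err != nil {
		return "", "", err
	}
	if err := req.WriteProxy(&p); err != nil {
		return "", "", err
	}
	return firstLine(d.String()), firstLine(p.String()), nil
}

func main() {
	direct, viaProxy, err := requestLines("HEAD", "http://www.baidu.com/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("direct:   ", direct)   // HEAD / HTTP/1.1
	fmt.Println("via proxy:", viaProxy) // HEAD http://www.baidu.com/ HTTP/1.1
}
```

The second line matches what curl sent in the trace above.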
Cause
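So why does the DialFunc above fail for HTTP sites? According to the squid documentation, squid by default refuses to establish tunnels via the CONNECT method to non-HTTPS sites. I set up my own squid proxy with that restriction removed from the configuration, and the DialFunc above could then reach both HTTP and HTTPS sites. In other words, for both HTTP and HTTPS a tunnel is established first and the request is sent afterwards.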
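Reading the fasthttp source, I found no way to rewrite the path in the request line to the target site's URL before a request is sent. So to use an HTTP proxy with fasthttp, use the DialFunc above and make sure the proxy allows tunnels to ports other than 443. If that is not possible, I would still recommend the standard library's net/http client, which is more convenient.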
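For reference, a minimal sketch of the net/http fallback, assuming the squid proxy listens on 127.0.0.1:3128 (a made-up address; substitute your own). newProxyClient and proxyFor are hypothetical helper names. The example only inspects which proxy would be chosen and makes no network calls.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxyClient returns an http.Client that routes every request through
// the HTTP proxy at proxyAddr, e.g. "http://127.0.0.1:3128".
func newProxyClient(proxyAddr string) (*http.Client, error) {
	u, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, err
	}
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(u)},
		Timeout:   10 * time.Second,
	}, nil
}

// proxyFor reports which proxy the client would use for target, without
// touching the network. It returns "" when no proxy applies.
func proxyFor(c *http.Client, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	tr, ok := c.Transport.(*http.Transport)
	if !ok || tr.Proxy == nil {
		return "", nil
	}
	u, err := tr.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	c, err := newProxyClient("http://127.0.0.1:3128") // assumed proxy address
	if err != nil {
		fmt.Println("bad proxy address:", err)
		return
	}
	for _, target := range []string{"http://www.baidu.com/", "https://www.google.com/"} {
		p, err := proxyFor(c, target)
		fmt.Println(target, "->", p, err)
	}
}
```

With this client, net/http itself decides between the absolute-URL form for http:// targets and a CONNECT tunnel for https:// targets, so the proxy does not need to allow CONNECT to port 80.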