Abuyun is fairly friendly to scraping beginners. After you purchase a plan it generates a license certificate and a secret key, and you can choose an HTTP tunnel or a SOCKS tunnel, in the Professional, Classic, or Dynamic edition. The integration docs list the connection details for whichever proxy pool you selected, with samples for PHP, Python, and others:
I went with two approaches: a standalone Python script and the Scrapy framework:
import requests

# Target page to fetch
targetUrl = "http://test.abuyun.com"

# Proxy server
proxyHost = "http-dyn.abuyun.com"
proxyPort = "9020"

# Tunnel credentials: the license certificate (user) and secret key (pass)
proxyUser = "************"
proxyPass = "************"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

resp = requests.get(targetUrl, proxies=proxies)
print(resp.status_code)
print(resp.text)
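The script above splices the credentials straight into the proxy URL. If the license or key ever contains URL-reserved characters, they need to be percent-encoded first or the URL will parse incorrectly; a minimal sketch (the user@example / p:ss credentials below are made-up examples, not real Abuyun values):

```python
from urllib.parse import quote

proxyHost = "http-dyn.abuyun.com"
proxyPort = "9020"
# Hypothetical credentials containing characters that would break a raw URL
proxyUser = "user@example"
proxyPass = "p:ss"

# safe="" forces every reserved character (including @ and :) to be encoded
proxyMeta = "http://%s:%s@%s:%s" % (
    quote(proxyUser, safe=""),
    quote(proxyPass, safe=""),
    proxyHost,
    proxyPort,
)
print(proxyMeta)  # http://user%40example:p%3Ass@http-dyn.abuyun.com:9020
```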
Scrapy framework:
import base64

# Proxy server
proxyServer = "http://http-dyn.abuyun.com:9020"

# Tunnel credentials
proxyUser = "H01234567890123D"
proxyPass = "0123456789012345"

# for Python 2
# proxyAuth = "Basic " + base64.b64encode(proxyUser + ":" + proxyPass)

# for Python 3
proxyAuth = "Basic " + base64.urlsafe_b64encode(
    bytes((proxyUser + ":" + proxyPass), "ascii")).decode("utf8")


class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # Route every outgoing request through the tunnel and attach the
        # Basic auth header the proxy expects
        request.meta["proxy"] = proxyServer
        request.headers["Proxy-Authorization"] = proxyAuth
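For the middleware to take effect it still has to be registered in the project's settings.py. A sketch of that config fragment follows; the "myproject.middlewares" module path is an assumption, so adjust it to wherever ProxyMiddleware lives in your own project:

```python
# settings.py — "myproject.middlewares" is a placeholder path, not from the original post
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.ProxyMiddleware": 543,
}
```

The priority 543 slots the middleware in among Scrapy's built-in downloader middlewares; any value works as long as it runs before the request is sent.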
-- The above is based on the Abuyun proxy service, with no commercial intent. I find it a fairly convenient and easy-to-use proxy; hope it gets adopted. Thanks (QQ: 858703032)