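Preface:

  • While working on a crawler project recently, I needed a proxy pool to get around anti-scraping measures, but I could not find a reasonably priced proxy provider. That reminded me of an article I read long ago about building a proxy pool on the cloud function services offered by cloud vendors, so I tried it. Following that approach, only Alibaba Cloud still works at the moment, and it rotates IPs slowly: many consecutive requests go out from the same IP. Tencent Cloud and Huawei Cloud currently do not support API gateway triggers, so in theory the old approach no longer works there. This article shows how to build a proxy pool on all three vendors anyway, using a Web function on Tencent Cloud and the new-version cloud function on Huawei Cloud.
  • Why cloud functions can serve as a proxy pool: a normally deployed cloud function that needs Internet access can be configured without a fixed public IP address. Each time the function is triggered and needs to reach the Internet, it is assigned an idle IP at random, so each invocation can go out from a different IP, which is what a proxy pool needs.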
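
To check whether a deployed function really leaves from a new IP on each invocation, one can call it several times and tally the IPs it reports. The sketch below is only an illustration of that check: `count_distinct_ips` and the `fetch_ip` callable are hypothetical names, and the addresses in the demo are placeholders. In practice `fetch_ip` would send a request through the function to a service such as https://ipinfo.io and return the IP found in the reply.

```python
from collections import Counter
from typing import Callable


def count_distinct_ips(fetch_ip: Callable[[], str], attempts: int = 10) -> Counter:
    """Call fetch_ip() `attempts` times and tally how often each IP appears."""
    seen = Counter()
    for _ in range(attempts):
        seen[fetch_ip()] += 1
    return seen


if __name__ == "__main__":
    # Placeholder addresses standing in for real calls through the function.
    fake_ips = iter(["203.0.113.1", "203.0.113.1", "203.0.113.2"])
    tally = count_distinct_ips(lambda: next(fake_ips), attempts=3)
    print(f"{len(tally)} distinct IPs in {sum(tally.values())} calls: {dict(tally)}")
```

A tally dominated by one address would match the slow rotation described above for Alibaba Cloud.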

1 Alibaba Cloud

Here is the function code; it can be used to deploy the function directly.

# -*- coding: utf8 -*-
import json
from base64 import b64decode, b64encode

import urllib3

# Certificate verification is disabled below, so silence the resulting warnings.
urllib3.disable_warnings()


def handler(environ: dict, start_response):
    """WSGI entry point: replay the JSON-described request and return the result."""
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        request_body_size = 0
    request_body = environ['wsgi.input'].read(request_body_size)

    # Expected JSON keys: method, url, optional headers, and a Base64-encoded body.
    kwargs = json.loads(request_body.decode("utf-8"))
    kwargs['body'] = b64decode(kwargs.get('body', ''))

    http = urllib3.PoolManager(cert_reqs="CERT_NONE")
    # Prohibit automatic redirect to avoid network errors such as connection reset
    r = http.request(**kwargs, retries=False, decode_content=False)

    response = {
        # Lowercase only the header names; values such as cookies are case-sensitive.
        "headers": {k.lower(): v for k, v in r.headers.items()},
        "status_code": r.status,
        "content": b64encode(r.data).decode('utf-8')
    }

    status = '200 OK'
    response_headers = [('Content-type', 'application/json')]
    start_response(status, response_headers)
    # WSGI requires the response body to be an iterable of bytes.
    return [json.dumps(response).encode('utf-8')]
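
The handler above reads a JSON document whose keys are passed to urllib3 and answers with `headers`, `status_code` and a Base64-encoded `content`. The following sketch shows what the client side of that exchange could look like; `build_payload` and `parse_result` are hypothetical helper names, and the reply in the demo is written by hand instead of coming from a live call.

```python
import json
from base64 import b64decode, b64encode
from typing import Optional


def build_payload(method: str, url: str, body: bytes = b"",
                  headers: Optional[dict] = None) -> str:
    """Serialize a request description into the JSON the handler reads."""
    return json.dumps({
        "method": method,
        "url": url,
        "headers": headers or {},
        # The handler Base64-decodes this field, so it is always sent encoded.
        "body": b64encode(body).decode("ascii"),
    })


def parse_result(raw: str) -> tuple:
    """Return (status_code, headers, body_bytes) from the handler's JSON reply."""
    result = json.loads(raw)
    return result["status_code"], result["headers"], b64decode(result["content"])


if __name__ == "__main__":
    print(build_payload("GET", "https://ipinfo.io"))
    # A hand-written reply in the handler's format, used in place of a live call.
    reply = json.dumps({"status_code": 200, "headers": {},
                        "content": b64encode(b"hello").decode("ascii")})
    print(parse_result(reply))
```

The string returned by `build_payload` would be sent as the POST body to the function's URL.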

2 Tencent Cloud

2.1 Open Function Service

image-20250405225900521

2.2 Create a function

  • Fill in the form as shown below

image-20250405230352454

  • Function code
# -*- coding: utf-8 -*-
import json
from base64 import b64encode, b64decode
import time
import urllib3
from flask import Flask, request, jsonify

urllib3.disable_warnings()
app = Flask(__name__)


@app.route("/proxy", methods=["POST"])
def proxy_request():
    try:
        # silent=True yields None instead of raising on a non-JSON payload.
        request_data = request.get_json(silent=True) or {}

        # 1. Check the required fields
        required_fields = ['method', 'url']
        if not all(field in request_data for field in required_fields):
            raise ValueError(f"The request must contain the fields: {required_fields}")

        # Decode the optional Base64 body and keep it as bytes for urllib3
        # (the original line ended with a stray comma, which made body a tuple)
        body = b64decode(request_data['body']) if 'body' in request_data else None

        # 2. Prepare the request arguments
        kwargs = {
            'method': request_data['method'].upper(),
            'url': request_data['url'],
            'headers': request_data.get('headers', {}),
            'body': body,
            'timeout': urllib3.Timeout(connect=5.0, read=10.0)
        }

        # 3. Send the request (SSL verification is disabled for testing only;
        #    configure certificates in production)
        http = urllib3.PoolManager(cert_reqs="CERT_NONE")
        start_time = time.time()
        response = http.request(**kwargs)
        elapsed = time.time() - start_time

        # 4. Build the response
        proxy_response = {
            "status_code": response.status,
            "headers": {k.lower(): v for k, v in response.headers.items()},
            "body": b64encode(response.data).decode('utf-8') if response.data else "",
            "elapsed": f"{elapsed:.3f}s"
        }

        return jsonify(proxy_response)

    except Exception as e:
        # Errors are reported in the same envelope, with a plain JSON body
        proxy_response = {
            "status_code": 500,
            "headers": {},
            "body": json.dumps({"error": str(e)}),
            "elapsed": "0s"
        }
        return jsonify(proxy_response)


if __name__ == "__main__":
    # debug=False: the Werkzeug debugger should not be exposed on a public URL
    app.run(host="0.0.0.0", port=9000, debug=False)

image-20250405230535069

image-20250405230613642

2.3 Open a new terminal

image-20250405231037280

2.4 Install the dependencies

pip install urllib3==1.26.18 -t .

image-20250405231445598

2.5 Get the URL

image-20250405232636827

2.6 Test

image-20250405232439629

2.7 Decode the Base64 result

image-20250405232521351
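
The same decoding can be done in Python instead of an online tool. As a small illustration, the snippet below decodes the Base64 string of the sample request body used in this article (`{"username": "123456"}`); a `body` value returned by the function is decoded the same way.

```python
import base64
import json

# Base64 form of the sample JSON body used in this article.
encoded = "eyJ1c2VybmFtZSI6ICIxMjM0NTYifQ=="

decoded_bytes = base64.b64decode(encoded)
decoded_text = decoded_bytes.decode("utf-8")
print(decoded_text)

# The decoded text is JSON, so it can be parsed further.
parsed = json.loads(decoded_text)
print(parsed["username"])
```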

2.8 Call it from Python

import json
import base64

import requests

# URL of the deployed function (see step 2.5)
url = "https://bigfish.ap-beijing.tencentscf.com/proxy"

# Request to be replayed by the function
data = {
    "method": "POST",
    "url": "https://ipinfo.io",
    "headers": {"Content-Type": "application/json"},
    "body": base64.b64encode(json.dumps({"username": "123456"}).encode()).decode()  # Base64-encoded body
    # "body": "eyJ1c2VybmFtZSI6ICIxMjM0NTYifQ=="
}
# Send the POST request
response = requests.post(url, json=data, timeout=15)
# Parse the returned data
if response.status_code == 200:
    res_json = response.json()
    decoded_content = base64.b64decode(res_json["body"]).decode()
    print("Decoded content:", decoded_content)
else:
    print(f"Request failed, status code: {response.status_code}")
    print("Response content:", response.text)
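
One detail of the function in 2.2 is worth handling on the client: its error branch still answers with HTTP 200, sets `status_code` to 500 with empty `headers`, and puts plain JSON (not Base64) in `body`. A possible way to tell that case apart from a normal reply is sketched below; `ProxyError` and `unwrap` are hypothetical names, and the empty-headers check is a heuristic, not something the function guarantees.

```python
import base64
import json


class ProxyError(RuntimeError):
    """Raised when the function itself failed rather than the target site."""


def unwrap(result: dict) -> bytes:
    """Return the upstream body, or raise ProxyError for the function's error reply."""
    body = result.get("body", "")
    # The error branch sets status_code 500, empty headers and a plain JSON body.
    if result.get("status_code") == 500 and not result.get("headers"):
        try:
            message = json.loads(body)["error"]
        except (ValueError, KeyError, TypeError):
            message = None  # not the error envelope; treat it as a normal reply
        if message is not None:
            raise ProxyError(message)
    return base64.b64decode(body)


if __name__ == "__main__":
    ok = {"status_code": 200, "headers": {"server": "x"},
          "body": base64.b64encode(b"hello").decode("ascii")}
    print(unwrap(ok))
```

In the script above, `unwrap(res_json)` would take the place of the direct `base64.b64decode(res_json["body"])` call.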

f72c9d6895d73bb4fe23cdc02287a5f3_720

3 Huawei Cloud

3.1 Create a function

image-20250406000326532

image-20250406000411247

3.2 Upload the code

  • Code download
File shared via Baidu Netdisk: hwy.zip
Link: https://pan.baidu.com/s/1hrsrshlortj_KaJNpdOjSQ Extraction code: bxah
  • To build the package yourself, see the Notes at the end

image-20250406000457816

3.3 Test

image-20250406001007820

3.4 Publish a version

  • The URL only seems to become usable after a version has been published

image-20250406001052350

3.5 Get the URL

image-20250406001141231

3.6 Test the URL

image-20250406002310382

3.7 Call it from Python

import json
import base64

import requests

# Replace this with the function URL obtained in step 3.5
url = "https://bigfish.ap-beijing.tencentscf.com/proxy"

# Request to be replayed by the function
data = {
    "method": "POST",
    "url": "https://ipinfo.io",
    "headers": {"Content-Type": "application/json"},
    "body": base64.b64encode(json.dumps({"username": "123456"}).encode()).decode()  # Base64-encoded body
    # "body": "eyJ1c2VybmFtZSI6ICIxMjM0NTYifQ=="
}
# Send the POST request
response = requests.post(url, json=data, timeout=15)
# Parse the returned data
if response.status_code == 200:
    res_json = response.json()
    decoded_content = base64.b64decode(res_json["body"]).decode()
    print("Decoded content:", decoded_content)
else:
    print(f"Request failed, status code: {response.status_code}")
    print("Response content:", response.text)
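
With functions deployed on more than one vendor, the client can rotate across their URLs and fall back to the next one when a call fails. The sketch below shows one possible round-robin wrapper; `EndpointPool`, the `post` callable and the example URLs are hypothetical and not part of the setup above.

```python
from itertools import cycle
from typing import Callable, Iterable


class EndpointPool:
    """Round-robin over several function URLs, moving on when a call fails."""

    def __init__(self, endpoints: Iterable[str]):
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = cycle(self._endpoints)

    def send(self, payload: dict, post: Callable[[str, dict], dict],
             max_tries: int = 3) -> dict:
        """Try up to max_tries endpoints in turn and return the first reply."""
        last_error = None
        for _ in range(max_tries):
            endpoint = next(self._cycle)
            try:
                return post(endpoint, payload)
            except Exception as exc:  # any transport failure: try the next URL
                last_error = exc
        raise RuntimeError(f"all {max_tries} attempts failed") from last_error


if __name__ == "__main__":
    # Stand-in sender; with requests installed it could instead be
    # lambda url, data: requests.post(url, json=data, timeout=15).json()
    def echo_post(url: str, data: dict) -> dict:
        return {"via": url, "sent": data}

    pool = EndpointPool(["https://fn-a.example/proxy", "https://fn-b.example/proxy"])
    for _ in range(3):
        print(pool.send({"method": "GET", "url": "https://ipinfo.io"}, echo_post)["via"])
```

Passing the sender in as a callable keeps the rotation logic independent of the HTTP library.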

Notes

# -*- coding: utf8 -*-
from pythonRT import LogLevel, Context
import json
import time
from base64 import b64decode, b64encode

import requests
import urllib3

urllib3.disable_warnings()


def test(environ: dict, start_response):
    # Quick connectivity check: return what ipinfo.io reports for the egress IP
    resp = requests.get("http://ipinfo.io").json()
    return resp


def myHandler(event: dict, context: Context):
    try:
        request_data = event

        # 1. Check the required fields
        required_fields = ['method', 'url']
        if not all(field in request_data for field in required_fields):
            raise ValueError(f"The request must contain the fields: {required_fields}")

        # Decode the optional Base64 body and keep it as bytes for urllib3
        body = b64decode(request_data['body']) if 'body' in request_data else None

        # 2. Prepare the request arguments
        kwargs = {
            'method': request_data['method'].upper(),
            'url': request_data['url'],
            'headers': request_data.get('headers', {}),
            'body': body,
            'timeout': urllib3.Timeout(connect=5.0, read=10.0)
        }

        # 3. Send the request (SSL verification is disabled for testing only;
        #    configure certificates in production)
        http = urllib3.PoolManager(cert_reqs="CERT_NONE")
        start_time = time.time()
        response = http.request(**kwargs)
        elapsed = time.time() - start_time

        # 4. Build the response
        proxy_response = {
            "status_code": response.status,
            "headers": {k.lower(): v for k, v in response.headers.items()},
            "body": b64encode(response.data).decode('utf-8') if response.data else "",
            "elapsed": f"{elapsed:.3f}s"
        }

        return proxy_response

    except Exception as e:
        # Errors are reported in the same envelope, with a plain JSON body
        return {
            "status_code": 500,
            "headers": {},
            "body": json.dumps({"error": str(e)}),
            "elapsed": "0s"
        }
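
To build an upload package from this code yourself, the handler file and its installed dependencies can be zipped together. The sketch below is one way to do that with the standard library, under some assumptions: the handler is saved as `index.py` (a hypothetical file name), the dependencies were installed into the same directory with `pip install ... -t .`, and the platform accepts a zip with those files at its root. `build_package` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def build_package(source_dir: str, zip_path: str) -> list:
    """Zip every file under source_dir, storing paths relative to it."""
    root = Path(source_dir)
    target = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in root.
            if path.is_file() and path.resolve() != target:
                arcname = path.relative_to(root).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names


if __name__ == "__main__":
    import tempfile

    # Demo on a throwaway directory containing a stub handler file.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        src.mkdir()
        (src / "index.py").write_text("# handler code goes here\n")
        print(build_package(str(src), str(Path(tmp) / "hwy.zip")))
```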