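This article builds on the previous one, Setting Request Timeouts.
Circuit Breaking
Circuit breaking is an important pattern for building resilient microservice applications. It lets an application limit the impact of failures, latency spikes, and other unpredictable network effects.
- Deploy the httpbin service:
The httpbin application serves as the backend service.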
cat samples/httpbin/httpbin.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
kubectl apply -f samples/httpbin/httpbin.yaml
kubectl get pod httpbin-66cdbdb6c5-9w4p6
NAME READY STATUS RESTARTS AGE
httpbin-66cdbdb6c5-9w4p6 2/2 Running 0 5m23s
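- Configure the circuit breaker:
Create a DestinationRule that applies circuit-breaking settings to calls to the httpbin service.
If Istio has mutual TLS authentication enabled, you must add the TLS traffic policy mode: ISTIO_MUTUAL to the DestinationRule before applying it. Otherwise, requests will fail with 503 errors.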
mkdir samples/httpbin/networking
vim samples/httpbin/networking/destination-rule-httpbin.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
kubectl apply -f samples/httpbin/networking/destination-rule-httpbin.yaml
kubectl get dr httpbin
NAME HOST AGE
httpbin httpbin 3m42s
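As noted earlier in this step, a mesh with mutual TLS enabled needs the ISTIO_MUTUAL traffic policy in the rule. The sketch below shows one way the same rule might look with that policy added under trafficPolicy. The output path and file name are hypothetical (the article edits samples/httpbin/networking/destination-rule-httpbin.yaml by hand), and the script only writes the file; applying it is left as a manual step.

```shell
# Hypothetical output path, chosen only for this sketch.
DR_FILE="${TMPDIR:-/tmp}/destination-rule-httpbin-mtls.yaml"

# Same rule as above, plus a tls block so the sidecar uses Istio mutual TLS.
cat > "$DR_FILE" <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
EOF

echo "wrote $DR_FILE"
# Review the file, then apply it with: kubectl apply -f "$DR_FILE"
```

Only the tls block differs from the rule shown above; the connection pool and outlier detection settings are unchanged.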
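- Create the client:
Add a client that sends traffic to the httpbin service. The client is fortio, a load-testing tool that lets you control the number of connections, the concurrency, and the delay of outgoing HTTP requests. With fortio you can reliably trip the circuit-breaking policy set in the DestinationRule above.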
cat samples/httpbin/sample-client/fortio-deploy.yaml
apiVersion: v1
kind: Service
metadata:
  name: fortio
  labels:
    app: fortio
    service: fortio
spec:
  ports:
  - port: 8080
    name: http
  selector:
    app: fortio
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fortio-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fortio
  template:
    metadata:
      annotations:
        sidecar.istio.io/statsInclusionPrefixes: cluster.outbound,cluster_manager,listener_manager,http_mixer_filter,tcp_mixer_filter,server,cluster.xds-grpc
      labels:
        app: fortio
    spec:
      containers:
      - name: fortio
        image: fortio/fortio:latest_release
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http-fortio
        - containerPort: 8079
          name: grpc-ping
kubectl apply -f samples/httpbin/sample-client/fortio-deploy.yaml
Log in to the client Pod and use the fortio tool to call the httpbin service. The -curl flag tells fortio to send a single request:
FORTIO_POD=$(kubectl get pod | grep fortio | awk '{ print $1 }')
kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -curl http://httpbin:8000/get
HTTP/1.1 200 OK
server: envoy
date: Mon, 01 Mar 2021 02:32:23 GMT
content-type: application/json
content-length: 622
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 55
{
  "args": {},
  "headers": {
    "Content-Length": "0",
    "Host": "httpbin:8000",
    "User-Agent": "fortio.org/fortio-1.11.3",
    "X-B3-Parentspanid": "782cd308639c0f00",
    "X-B3-Sampled": "0",
    "X-B3-Spanid": "adbda1f9940d1821",
    "X-B3-Traceid": "4ee996565afd9133782cd308639c0f00",
    "X-Envoy-Attempt-Count": "1",
    "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/httpbin;Hash=44b5c92b9b7af426d81bc6e05b6c9a4819037d54acfdf890c5220619a0c0a869;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/default"
  },
  "origin": "127.0.0.1",
  "url": "http://httpbin:8000/get"
}
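The request to the backend service succeeded. Next, test circuit breaking.
- Trip the circuit breaker:
The DestinationRule sets maxConnections: 1 and http1MaxPendingRequests: 1. These rules mean that once more than one connection and more than one request are in flight at the same time, istio-proxy blocks any further requests or connections it is asked to open.
- Send requests over 2 concurrent connections (-c 2), 20 requests in total (-n 20):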
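To make the two limits concrete, here is a deliberately simplified sketch. It assumes that, for HTTP/1.1, each connection carries one request at a time, so at any instant at most maxConnections requests can be in flight and at most http1MaxPendingRequests can wait in the queue. The function name admitted is hypothetical, and the result is only an upper bound: it is not Envoy's actual algorithm, and real runs can reject more requests depending on timing and connection churn.

```shell
# Upper bound on how many of N simultaneous requests can be admitted:
#   $1 = concurrent requests, $2 = maxConnections, $3 = http1MaxPendingRequests
admitted() {
  capacity=$(($2 + $3))   # in-flight slots plus queue slots
  if [ "$1" -le "$capacity" ]; then
    echo "$1"
  else
    echo "$capacity"
  fi
}

# With the limits from the DestinationRule above (1 connection, 1 pending):
for c in 1 2 3; do
  ok=$(admitted "$c" 1 1)
  echo "concurrent=$c admitted<=$ok rejected>=$((c - ok))"
done
```

Under this model, two concurrent callers can both fit within the limits, while a third simultaneous caller has nowhere to go and is rejected with a 503.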
kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning http://httpbin:8000/get
02:42:27 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 2->2 procs, for 20 calls: http://httpbin:8000/get
Starting at max qps with 2 thread(s) [gomax 2] for exactly 20 calls (10 per thread + 0)
02:42:27 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 72.128739ms : 20 calls. qps=277.28
Aggregated Function Time : count 20 avg 0.0064766836 +/- 0.003552 min 0.003330814 max 0.015536857 sum 0.129533672
# range, mid point, percentile, count
>= 0.00333081 <= 0.004 , 0.00366541 , 30.00, 6
> 0.004 <= 0.005 , 0.0045 , 50.00, 4
> 0.005 <= 0.006 , 0.0055 , 65.00, 3
> 0.006 <= 0.007 , 0.0065 , 75.00, 2
> 0.007 <= 0.008 , 0.0075 , 80.00, 1
> 0.008 <= 0.009 , 0.0085 , 85.00, 1
> 0.012 <= 0.014 , 0.013 , 95.00, 2
> 0.014 <= 0.0155369 , 0.0147684 , 100.00, 1
# target 50% 0.005
# target 75% 0.007
# target 90% 0.013
# target 99% 0.0152295
# target 99.9% 0.0155061
Sockets used: 3 (for perfect keepalive, would be 2)
Jitter: false
Code 200 : 19 (95.0 %)
Code 503 : 1 (5.0 %)
Response Header Sizes : count 20 avg 218.55 +/- 50.14 min 0 max 231 sum 4371
Response Body/Total Sizes : count 20 avg 821.5 +/- 133.2 min 241 max 853 sum 16430
All done 20 calls (plus 0 warmup) 6.477 ms avg, 277.3 qps
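Almost all of the requests completed: 19 of 20 (95%) returned 200. istio-proxy allows some leeway.
- Raise the number of concurrent connections to 3 (-c 3) and send 30 requests (-n 30):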
kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 3 -qps 0 -n 30 -loglevel Warning http://httpbin:8000/get
02:46:45 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 2->2 procs, for 30 calls: http://httpbin:8000/get
Starting at max qps with 3 thread(s) [gomax 2] for exactly 30 calls (10 per thread + 0)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 49.708473ms : 30 calls. qps=603.52
Aggregated Function Time : count 30 avg 0.0042163392 +/- 0.004712 min 0.000489901 max 0.014537558 sum 0.126490177
# range, mid point, percentile, count
>= 0.000489901 <= 0.001 , 0.000744951 , 36.67, 11
> 0.001 <= 0.002 , 0.0015 , 56.67, 6
> 0.002 <= 0.003 , 0.0025 , 66.67, 3
> 0.004 <= 0.005 , 0.0045 , 70.00, 1
> 0.006 <= 0.007 , 0.0065 , 73.33, 1
> 0.009 <= 0.01 , 0.0095 , 80.00, 2
> 0.01 <= 0.011 , 0.0105 , 86.67, 2
> 0.011 <= 0.012 , 0.0115 , 93.33, 2
> 0.014 <= 0.0145376 , 0.0142688 , 100.00, 2
# target 50% 0.00166667
# target 75% 0.00925
# target 90% 0.0115
# target 99% 0.0144569
# target 99.9% 0.0145295
Sockets used: 24 (for perfect keepalive, would be 3)
Jitter: false
Code 200 : 8 (26.7 %)
Code 503 : 22 (73.3 %)
Response Header Sizes : count 30 avg 61.433333 +/- 101.9 min 0 max 231 sum 1843
Response Body/Total Sizes : count 30 avg 404.03333 +/- 270.4 min 241 max 853 sum 12121
All done 30 calls (plus 0 warmup) 4.216 ms avg, 603.5 qps
This is the expected circuit-breaking behavior: only 26.7% of the requests succeeded, and the circuit breaker rejected the rest:
Code 200 : 8 (26.7 %)
Code 503 : 22 (73.3 %)
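When comparing several runs, it can help to pull the success rate out of the fortio summary automatically. The sketch below parses the two "Code" lines from the run above with awk; the helper name success_rate is hypothetical, and the parsing assumes the summary format shown in this article (status code in field 2, count in field 4).

```shell
# Summary lines copied from the fortio run above.
fortio_summary='Code 200 : 8 (26.7 %)
Code 503 : 22 (73.3 %)'

# Read fortio output on stdin and print the share of 200 responses in percent.
success_rate() {
  awk '/^Code [0-9]+ :/ { total += $4; if ($2 == "200") ok += $4 }
       END { if (total > 0) printf "%.1f\n", 100 * ok / total }'
}

printf '%s\n' "$fortio_summary" | success_rate
```

The same function could be fed from a live run by piping the output of the kubectl exec command into it, in which case the -it flags are best dropped since no terminal is attached to a pipe.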
- Inspect the istio-proxy stats for circuit-breaking details:
kubectl exec $FORTIO_POD -c istio-proxy -- pilot-agent request GET stats | grep httpbin | grep pending
cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.default.rq_pending_open: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.high.rq_pending_open: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 23
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 28
upstream_rq_pending_overflow is 23, which means that 23 calls so far have been flagged for circuit breaking.
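The counter is cumulative, so it can be cross-checked against the fortio runs above: 1 request returned 503 in the first run and 22 in the second. The sketch below extracts the counter from the stats lines shown above and prints it next to that sum. The helper name counter is hypothetical, and the parsing assumes the "name: value" format shown in this article.

```shell
# Two of the stats lines shown above.
stats='cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 23
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 28'

# Print the value of the stat whose name ends in ".<counter name>".
counter() {
  awk -F': ' -v suffix=".$1" '
    { n = length($1) - length(suffix) + 1 }
    n > 0 && substr($1, n) == suffix { print $2 }'
}

overflow=$(printf '%s\n' "$stats" | counter upstream_rq_pending_overflow)
echo "pending overflow: $overflow, 503s reported by fortio: $((1 + 22))"
```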
- Adjust the circuit breaker:
Edit the DestinationRule to set maxConnections: 3 and http1MaxPendingRequests: 3.
vim samples/httpbin/networking/destination-rule-httpbin.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 3
      http:
        http1MaxPendingRequests: 3
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
kubectl apply -f samples/httpbin/networking/destination-rule-httpbin.yaml
kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 3 -qps 0 -n 30 -loglevel Warning http://httpbin:8000/get
03:13:34 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 2->2 procs, for 30 calls: http://httpbin:8000/get
Starting at max qps with 3 thread(s) [gomax 2] for exactly 30 calls (10 per thread + 0)
Ended after 143.869637ms : 30 calls. qps=208.52
Aggregated Function Time : count 30 avg 0.014009803 +/- 0.01431 min 0.005262414 max 0.05379173 sum 0.42029408
# range, mid point, percentile, count
>= 0.00526241 <= 0.006 , 0.00563121 , 23.33, 7
> 0.006 <= 0.007 , 0.0065 , 40.00, 5
> 0.007 <= 0.008 , 0.0075 , 56.67, 5
> 0.008 <= 0.009 , 0.0085 , 63.33, 2
> 0.009 <= 0.01 , 0.0095 , 66.67, 1
> 0.01 <= 0.011 , 0.0105 , 73.33, 2
> 0.012 <= 0.014 , 0.013 , 80.00, 2
> 0.025 <= 0.03 , 0.0275 , 90.00, 3
> 0.05 <= 0.0537917 , 0.0518959 , 100.00, 3
# target 50% 0.0076
# target 75% 0.0125
# target 90% 0.03
# target 99% 0.0534126
# target 99.9% 0.0537538
Sockets used: 3 (for perfect keepalive, would be 3)
Jitter: false
Code 200 : 30 (100.0 %)
Response Header Sizes : count 30 avg 230.26667 +/- 0.4422 min 230 max 231 sum 6908
Response Body/Total Sizes : count 30 avg 852.26667 +/- 0.4422 min 852 max 853 sum 25568
All done 30 calls (plus 0 warmup) 14.010 ms avg, 208.5 qps
Now 100% of the requests succeed, and none are rejected by the circuit breaker:
Code 200 : 30 (100.0 %)