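Global routing: provide a failure-retry mechanism for every routed service. "Global" here means default-filters.
# Enable the Retry filter for all routes: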
spring.cloud.gateway.default-filters[0].name=Retry
# With the default parameters only GET requests are retried; set methods so that POST is retried too
spring.cloud.gateway.default-filters[0].args.methods[0]=GET
spring.cloud.gateway.default-filters[0].args.methods[1]=POST
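Retry in spring cloud gateway is not something you get automatically by adding spring-retry, nor by adding a few timeout parameters.
Here is the relevant section of the Spring documentation: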
6.26. The Retry GatewayFilter Factory

The Retry GatewayFilter factory supports the following parameters:

retries: The number of retries that should be attempted.
statuses: The HTTP status codes that should be retried, represented by using org.springframework.http.HttpStatus.
methods: The HTTP methods that should be retried, represented by using org.springframework.http.HttpMethod.
series: The series of status codes to be retried, represented by using org.springframework.http.HttpStatus.Series.
exceptions: A list of thrown exceptions that should be retried.
backoff: The configured exponential backoff for the retries. Retries are performed after a backoff interval of firstBackoff * (factor ^ n), where n is the iteration. If maxBackoff is configured, the maximum backoff applied is limited to maxBackoff. If basedOnPreviousValue is true, the backoff is calculated by using prevBackoff * factor.

The following defaults are configured for Retry filter, if enabled:

retries: Three times
series: 5XX series
methods: GET method
exceptions: IOException and TimeoutException
backoff: disabled

The following listing configures a Retry GatewayFilter:

Example 55. application.yml
spring:
  cloud:
    gateway:
      routes:
      - id: retry_test
        uri: http://localhost:8080/flakey
        predicates:
        - Host=*.retry.com
        filters:
        - name: Retry
          args:
            retries: 3
            statuses: BAD_GATEWAY
            methods: GET,POST
            backoff:
              firstBackoff: 10ms
              maxBackoff: 50ms
              factor: 2
              basedOnPreviousValue: false
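To get a feel for the backoff values in that example, here is a minimal sketch of the documented formula firstBackoff * (factor ^ n) with the maxBackoff cap. The class and method names are made up for illustration, and counting n from 0 is my assumption; the gateway's real backoff implementation may count iterations differently.

```java
import java.time.Duration;

// Illustrative only: follows the formula quoted from the documentation,
// not the gateway's actual implementation.
class BackoffSketch {

    // Backoff applied before retry number n (n is ASSUMED to start at 0).
    static Duration backoff(Duration firstBackoff, Duration maxBackoff, int factor, int n) {
        // firstBackoff * (factor ^ n)
        Duration next = firstBackoff.multipliedBy((long) Math.pow(factor, n));
        // Cap the result when a maxBackoff is configured (null means no cap).
        if (maxBackoff != null && next.compareTo(maxBackoff) > 0) {
            return maxBackoff;
        }
        return next;
    }

    public static void main(String[] args) {
        Duration first = Duration.ofMillis(10); // firstBackoff: 10ms
        Duration max = Duration.ofMillis(50);   // maxBackoff: 50ms
        for (int n = 0; n < 4; n++) {
            System.out.println("retry " + n + ": " + backoff(first, max, 2, n).toMillis() + "ms");
        }
    }
}
```

Under these assumptions the waits come out as 10ms, 20ms, 40ms, and then 50ms, because the uncapped fourth value of 80ms exceeds maxBackoff.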
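The configuration above is the example from the official documentation; it applies to one specific route.
The route configuration properties are defined in org.springframework.cloud.gateway.config.GatewayProperties;
the conversion of route definitions into routes is done by RouteDefinitionRouteLocator implements RouteLocator, BeanFactoryAware, ApplicationEventPublisherAware: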
private Route convertToRoute(RouteDefinition routeDefinition) {
    AsyncPredicate<ServerWebExchange> predicate = combinePredicates(routeDefinition);
    List<GatewayFilter> gatewayFilters = getFilters(routeDefinition);
    return Route.async(routeDefinition).asyncPredicate(predicate)
            .replaceFilters(gatewayFilters).build();
}

private List<GatewayFilter> getFilters(RouteDefinition routeDefinition) {
    List<GatewayFilter> filters = new ArrayList<>();
    // TODO: support option to apply defaults after route specific filters?
    if (!this.gatewayProperties.getDefaultFilters().isEmpty()) {
        filters.addAll(loadGatewayFilters(routeDefinition.getId(),
                new ArrayList<>(this.gatewayProperties.getDefaultFilters())));
    }
    if (!routeDefinition.getFilters().isEmpty()) {
        filters.addAll(loadGatewayFilters(routeDefinition.getId(),
                new ArrayList<>(routeDefinition.getFilters())));
    }
    AnnotationAwareOrderComparator.sort(filters);
    return filters;
}
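The effect of getFilters can be sketched in plain Java: the default filters go into the list first, the route's own filters follow, and the combined list is sorted by order. NamedFilter, the filter names, and the order values below are hypothetical stand-ins, and a plain stable sort takes the place of AnnotationAwareOrderComparator, so treat this as a rough model rather than the gateway's code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Rough model of how default-filters and route filters end up in one list.
class FilterMergeSketch {

    // Hypothetical stand-in for an ordered GatewayFilter.
    static final class NamedFilter {
        final String name;
        final int order;

        NamedFilter(String name, int order) {
            this.name = name;
            this.order = order;
        }
    }

    static List<String> merge(List<NamedFilter> defaultFilters, List<NamedFilter> routeFilters) {
        List<NamedFilter> filters = new ArrayList<>();
        filters.addAll(defaultFilters); // default-filters are added first
        filters.addAll(routeFilters);   // then the route-specific filters
        // List.sort is stable: filters with equal order keep their insertion order.
        filters.sort(Comparator.comparingInt((NamedFilter f) -> f.order));
        List<String> names = new ArrayList<>();
        for (NamedFilter f : filters) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<NamedFilter> defaults = List.of(new NamedFilter("Retry", 1));
        List<NamedFilter> route = List.of(new NamedFilter("StripPrefix", 1), new NamedFilter("AddRequestHeader", 2));
        System.out.println(merge(defaults, route));
    }
}
```

The point for this post is simply that a Retry filter declared under default-filters ends up in the filter list of every route, whether or not the route declares filters of its own.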
The code that decides whether to retry:
ServerWebExchange exchange = context.applicationContext();
if (exceedsMaxIterations(exchange, retryConfig)) {
    return false;
}
// Check the configured status codes first; statuses take precedence over series
HttpStatus statusCode = exchange.getResponse().getStatusCode();
boolean retryableStatusCode = retryConfig.getStatuses()
        .contains(statusCode);
// null status code might mean a network exception?
// If the status code is not among the configured statuses, check the series next
if (!retryableStatusCode && statusCode != null) {
    // try the series
    retryableStatusCode = false;
    for (int i = 0; i < retryConfig.getSeries().size(); i++) {
        if (statusCode.series().equals(retryConfig.getSeries().get(i))) {
            retryableStatusCode = true;
            break;
        }
    }
}
final boolean finalRetryableStatusCode = retryableStatusCode;
trace("retryableStatusCode: %b, statusCode %s, configured statuses %s, configured series %s",
        () -> finalRetryableStatusCode, () -> statusCode,
        retryConfig::getStatuses, retryConfig::getSeries);
// Check whether the HTTP method of the request is retryable
HttpMethod httpMethod = exchange.getRequest().getMethod();
boolean retryableMethod = retryConfig.getMethods().contains(httpMethod);
trace("retryableMethod: %b, httpMethod %s, configured methods %s",
        () -> retryableMethod, () -> httpMethod, retryConfig::getMethods);
// A retry happens only when both the status code and the request method qualify
return retryableMethod && finalRetryableStatusCode;
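Stripped of the Spring types, the decision above boils down to a few set lookups. The sketch below is my own simplification: status codes are plain ints, a series is represented by the first digit of the status (5 for 5xx), and methods are strings. None of these names come from the gateway itself.

```java
import java.util.Set;

// Simplified model of the retry decision: status (or series) AND method must match.
class RetryDecisionSketch {

    static boolean shouldRetry(int status, String method,
                               Set<Integer> statuses, Set<Integer> seriesDigits, Set<String> methods) {
        // Explicitly configured status codes are checked first.
        boolean retryableStatus = statuses.contains(status);
        // Otherwise fall back to the series, modelled here as the first digit (5 for 5xx).
        if (!retryableStatus) {
            retryableStatus = seriesDigits.contains(status / 100);
        }
        boolean retryableMethod = methods.contains(method);
        // Both conditions must hold.
        return retryableMethod && retryableStatus;
    }

    public static void main(String[] args) {
        Set<Integer> noStatuses = Set.of();
        Set<Integer> serverErrors = Set.of(5);
        // Defaults as described in the documentation: 5XX series, GET only.
        System.out.println(shouldRetry(503, "GET", noStatuses, serverErrors, Set.of("GET")));
        System.out.println(shouldRetry(503, "POST", noStatuses, serverErrors, Set.of("GET")));
        // After adding POST to methods, as in the properties at the top of this post.
        System.out.println(shouldRetry(503, "POST", noStatuses, serverErrors, Set.of("GET", "POST")));
    }
}
```

With the defaults a POST that fails with 503 is not retried, which is why the methods list at the top of this post adds POST.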
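The default value of the series parameter is SERVER_ERROR, so retries are triggered by server-side errors, which covers all 5xx errors. To retry on specific 4xx errors as well, add them through the statuses parameter.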
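Addendum: why I started tinkering with retry
Previously I used eureka and had no failure retry configured for services going up or down. The chain of delays involved when a service starts or stops meant that APIs exposed through the gateway would sometimes fail.
Later I switched to nacos, where services come up and go down faster. After also tuning a series of timeout and refresh-interval parameters, I could at least shrink the window during which API calls fail while a service goes up or down (1 to 10 seconds).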
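# Shorten the interval so the gateway's server list is refreshed quickly and stays current
# But a shorter refresh interval means threads sleep and wake up more often, which is surely bad for efficiency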
ribbon.ServerListRefreshInterval=1000
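I also looked through other ribbon parameters and tried to add a retry mechanism, but, well... maybe the parameters I set were wrong; in any case, no retry happened.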
hystrix.command.default.execution.timeout.enabled=true
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=25000
ribbon.ReadTimeout=20000
ribbon.ConnectTimeout=5000
ribbon.MaxAutoRetries=1
ribbon.MaxAutoRetriesNextServer=1
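But when a service goes offline, there is a period in which neither a connect timeout nor a read timeout occurs; the connection is simply refused.
Also, if the service is not shut down gracefully, for example with kill -9, the gateway reports the following error (拒绝连接 in the log is the localized "Connection refused" message):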
2021-09-07 11:52:51,446 [reactor-http-epoll-1] TRACE o.s.c.g.f.LoadBalancerClientFilter - LoadBalancerClientFilter url before: lb://xxx/xx-api/ext/wanyee/msg/list?pageNo=30&pageSize=2&beginTime=2020-09-07%2000:00:00&endTime=2021-09-07%2023:59:59&keyword=&suid=&uid=
2021-09-07 11:52:51,446 [reactor-http-epoll-1] TRACE o.s.c.g.f.LoadBalancerClientFilter - LoadBalancerClientFilter url chosen: http://192.168.2.1:9898/xx-api/list?pageNo=30&pageSize=2&beginTime=2020-09-07%2000:00:00&endTime=2021-09-07%2023:59:59&keyword=&suid=&uid=
2021-09-07 11:52:51,450 [reactor-http-epoll-1] ERROR o.s.b.a.w.r.e.AbstractErrorWebExceptionHandler - [dc72bc7f-127] 500 Server Error for HTTP GET "/xx-api/list?pageNo=30&pageSize=2&beginTime=2020-09-07%2000:00:00&endTime=2021-09-07%2023:59:59&keyword=&suid=&uid="
io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: 拒绝连接: /192.168.2.56:9198
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
|_ checkpoint ⇢ org.springframework.cloud.gateway.filter.WeightCalculatorWebFilter [DefaultWebFilterChain]
|_ checkpoint ⇢ org.springframework.boot.actuate.metrics.web.reactive.server.MetricsWebFilter [DefaultWebFilterChain]
|_ checkpoint ⇢ HTTP GET "/ams-api/ext/wanyee/msg/list?pageNo=30&pageSize=2&beginTime=2020-09-07%2000:00:00&endTime=2021-09-07%2023:59:59&keyword=&suid=&uid=" [ExceptionHandlingWebHandler]
Stack trace:
Caused by: java.net.ConnectException: finishConnect(..) failed: 拒绝连接
at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
at io.netty.channel.unix.Socket.finishConnect(Socket.java:251)
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:673)
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:650)
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:530)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
2021-09-07 11:52:51,452 [reactor-http-epoll-1] TRACE o.s.c.g.f.GatewayMetricsFilter - gateway.requests tags: [tag(httpMethod=GET),tag(httpStatusCode=500),tag(outcome=SERVER_ERROR),tag(routeId=ams-api),tag(routeUri=lb://ams),tag(status=INTERNAL_SERVER_ERROR)]
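Searching the web for the error keywords turned up nothing useful.
But Spring was bound to have a way to deal with this, and that way is the retry mechanism.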
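One detail that ties the stack trace back to the documentation: the root cause in the log is java.net.ConnectException, which is a subclass of IOException, and IOException is one of the default retryable exceptions. Below is a small sketch of such a check; the class name is made up, and looking at the direct cause as well is my assumption about how a filter might match wrapped exceptions.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;
import java.util.concurrent.TimeoutException;

// Checks an error against the default exception types listed in the documentation.
class RetryableExceptionSketch {

    // Defaults per the quoted docs: IOException and TimeoutException.
    static final List<Class<? extends Throwable>> DEFAULT_EXCEPTIONS =
            List.of(IOException.class, TimeoutException.class);

    static boolean isRetryable(Throwable error) {
        for (Class<? extends Throwable> type : DEFAULT_EXCEPTIONS) {
            // ASSUMPTION: match the error itself or its direct cause.
            // isInstance(null) is false, so a missing cause is harmless.
            if (type.isInstance(error) || type.isInstance(error.getCause())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A refused connection surfaces as ConnectException, which is an IOException.
        System.out.println(isRetryable(new ConnectException("Connection refused")));
        System.out.println(isRetryable(new IllegalStateException("not an I/O problem")));
    }
}
```

So a refused connection to an instance that has just been killed falls under the default exceptions; the request method still has to be in the configured methods for a retry to happen.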