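Spring Cloud Feign communicates over HTTP/HTTPS. By default it uses java.net.HttpURLConnection, which relies only on the JDK's built-in keep-alive and offers no tunable connection pool, so for performance it is common to plug in httpclient or okhttp as the underlying transport.
The Maven coordinates for the httpclient integration are: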
<dependency>
    <groupId>io.github.openfeign</groupId>
    <artifactId>feign-httpclient</artifactId>
    <version>9.5.0</version>
</dependency>
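Our project runs on the Spring Cloud (Dalston.SR4) stack, and for performance we had also enabled httpclient support for Feign. During one load test we found that once the request TPS reached a certain level, overall request latency rose noticeably. Log analysis later showed that more than 5s elapsed between the caller issuing a request and the server receiving it. We eventually narrowed the problem down to the underlying implementation of the Feign call (after quite a few detours; at one point we suspected the network itself). Reading the code, we found that Feign initializes its client like this: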
@Configuration
@ConditionalOnClass(ApacheHttpClient.class)
@ConditionalOnProperty(value = "feign.httpclient.enabled", matchIfMissing = true)
class HttpClientFeignLoadBalancedConfiguration {

    @Autowired(required = false)
    private HttpClient httpClient;

    @Bean
    @ConditionalOnMissingBean(Client.class)
    public Client feignClient(CachingSpringLoadBalancerFactory cachingFactory,
                              SpringClientFactory clientFactory) {
        ApacheHttpClient delegate;
        if (this.httpClient != null) {
            delegate = new ApacheHttpClient(this.httpClient);
        } else {
            delegate = new ApacheHttpClient();
        }
        return new LoadBalancerFeignClient(delegate, cachingFactory, clientFactory);
    }
}
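In our project we had not explicitly declared an org.apache.http.client.HttpClient bean, so execution took the delegate = new ApacheHttpClient(); branch. The no-argument ApacheHttpClient builds its client with HttpClientBuilder.create().build(), which falls back to a default PoolingHttpClientConnectionManager. Digging further, we reached the constructor of org.apache.http.impl.conn.PoolingHttpClientConnectionManager and the CPool constructor it calls: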
public PoolingHttpClientConnectionManager(
        final HttpClientConnectionOperator httpClientConnectionOperator,
        final HttpConnectionFactory<HttpRoute, ManagedHttpClientConnection> connFactory,
        final long timeToLive, final TimeUnit tunit) {
    super();
    this.configData = new ConfigData();
    this.pool = new CPool(new InternalConnectionFactory(
            this.configData, connFactory), 2, 20, timeToLive, tunit);
    this.pool.setValidateAfterInactivity(2000);
    this.connectionOperator = Args.notNull(httpClientConnectionOperator, "HttpClientConnectionOperator");
    this.isShutDown = new AtomicBoolean(false);
}

public CPool(
        final ConnFactory<HttpRoute, ManagedHttpClientConnection> connFactory,
        final int defaultMaxPerRoute, final int maxTotal,
        final long timeToLive, final TimeUnit tunit) {
    super(connFactory, defaultMaxPerRoute, maxTotal);
    this.timeToLive = timeToLive;
    this.tunit = tunit;
}
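At this point the problem is obvious at a glance: the defaults are defaultMaxPerRoute=2 and maxTotal=20, so under load most requests sit in a queue waiting for a pooled connection before they are even sent. With the root cause found, the fix is to stop relying on the default-constructed org.apache.http.client.HttpClient and declare our own HttpClient bean in the application: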
@Bean(destroyMethod = "close")
public CloseableHttpClient httpClient() {
    PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
    connectionManager.setMaxTotal(400);
    connectionManager.setDefaultMaxPerRoute(100);
    RequestConfig requestConfig = RequestConfig.custom()
            .setConnectionRequestTimeout(2000) // max wait for a connection from the pool
            .setConnectTimeout(2000)           // max time to establish the connection
            .setSocketTimeout(15000)           // max wait for response data from the server
            .build();
    HttpClientBuilder httpClientBuilder = HttpClientBuilder.create()
            .setConnectionManager(connectionManager)
            .setDefaultRequestConfig(requestConfig)
            // custom retry strategy: retry once on 502 and 503
            .setServiceUnavailableRetryStrategy(new CustomizedServiceUnavailableRetryStrategy())
            .evictExpiredConnections();
    return httpClientBuilder.build();
}
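The CustomizedServiceUnavailableRetryStrategy class is not shown in this post. The decision it makes (retry once on 502 and 503) might look roughly like the JDK-only sketch below. The names RetryDecision and shouldRetry are hypothetical; in a real strategy the same check would live inside ServiceUnavailableRetryStrategy.retryRequest(response, executionCount, context), with the status code read from the response. The sketch assumes executionCount is 1 for the first attempt, which is how HttpClient 4.x counts executions.

```java
// Hypothetical helper illustrating the "retry once on 502/503" rule.
// It is NOT the project's actual CustomizedServiceUnavailableRetryStrategy.
final class RetryDecision {
    // One retry means at most two executions of the request in total.
    private static final int MAX_RETRIES = 1;

    /**
     * @param statusCode     HTTP status of the response just received
     * @param executionCount 1 for the first attempt, 2 for the first retry, ...
     * @return true if the request should be sent again
     */
    static boolean shouldRetry(int statusCode, int executionCount) {
        boolean retryableStatus = statusCode == 502 || statusCode == 503;
        return executionCount <= MAX_RETRIES && retryableStatus;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(503, 1)); // first attempt failed with 503
        System.out.println(shouldRetry(503, 2)); // the single retry also failed
        System.out.println(shouldRetry(500, 1)); // status not in the retry list
    }
}
```

Keeping the rule in a small pure function like this makes it easy to unit test without an HTTP server; the strategy class itself then only adapts it to the HttpClient interface.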
With that, the problem was solved.
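As a rough illustration of why a per-route limit of 2 can produce multi-second delays, here is a back-of-the-envelope estimate. The figures of 200 concurrent requests to one service instance and 50 ms per call are made-up numbers for illustration, not measurements from our load test, and the model ignores timeouts and connection setup cost. The class and method names are hypothetical. Note that with Feign behind Ribbon each target host and port is its own route, so the per-route limit applies to every service instance separately.

```java
// Simplified queueing estimate: only `perRoute` requests to one host can be
// in flight at once, so the rest wait in "rounds" for a pooled connection.
final class PoolQueueEstimate {

    /** Number of sequential rounds needed to serve all concurrent requests. */
    static long rounds(int concurrentRequests, int perRoute) {
        return (concurrentRequests + perRoute - 1L) / perRoute; // ceiling division
    }

    /** Approximate time until the last queued request completes. */
    static long worstCaseLatencyMs(int concurrentRequests, int perRoute, long serviceTimeMs) {
        return rounds(concurrentRequests, perRoute) * serviceTimeMs;
    }

    public static void main(String[] args) {
        // Default pool: defaultMaxPerRoute = 2
        System.out.println(worstCaseLatencyMs(200, 2, 50));   // prints 5000
        // Tuned pool: defaultMaxPerRoute = 100
        System.out.println(worstCaseLatencyMs(200, 100, 50)); // prints 100
    }
}
```

Under these assumed numbers the last request waits about 5 s with the default limit and about 100 ms with the tuned one, which is the same order of magnitude as the delay we saw between the caller sending a request and the server receiving it.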
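Summary:
When adopting a new technology stack, be sure to read the relevant documentation and understand each component's configurable parameters, because the default values often cannot cope with high-concurrency scenarios. This applies especially to applications built on Spring Boot, where auto-configuration makes it easy to overlook parameters that really need to be set explicitly.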