Apache Kafka: Concurrent Consumption via concurrency



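Overview

By default, a Spring-Kafka @KafkaListener consumes messages serially. The drawback is obvious: when producers publish more data than the listener can keep up with, messages pile up on the consumer side.

One remedy is to start multiple processes and consume concurrently across them, although how far that scales depends on the number of partitions in the topic.

Can a single process consume concurrently with multiple threads instead? Spring Kafka provides exactly this, and it is simple to use. The important part is to understand how it works so that you can apply it flexibly.

The concurrency attribute of @KafkaListener specifies the number of threads used for concurrent consumption.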


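For example, with concurrency=2 Spring-Kafka creates 2 threads that concurrently consume the messages for the method annotated with @KafkaListener. There is a precondition, though: the value should not exceed the number of partitions.

  • concurrency < number of partitions: consumption is uneven, because one consumer thread may handle several partitions

  • concurrency = number of partitions: the ideal case, where each consumer thread handles exactly one partition

  • concurrency > number of partitions: some consumer threads have no partition to consume, which wastes resources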
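The three cases can be made concrete with a small simulation. This is only a sketch under a simplifying assumption: partitions are dealt out round-robin, which is not necessarily what Kafka's configured assignor does. The class and method names are hypothetical and no Kafka API is involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: deals partitions out to consumer threads round-robin. */
final class PartitionSpreadSketch {

    /** Returns, for each consumer thread, the partition ids it would own. */
    static List<List<Integer>> spread(int partitions, int concurrency) {
        if (concurrency <= 0) {
            throw new IllegalArgumentException("concurrency must be positive");
        }
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < concurrency; c++) {
            owned.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            owned.get(p % concurrency).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        System.out.println(spread(3, 2)); // concurrency < partitions: [[0, 2], [1]]
        System.out.println(spread(2, 2)); // concurrency = partitions: [[0], [1]]
        System.out.println(spread(2, 3)); // concurrency > partitions: [[0], [1], []]
    }
}
```

In the last case the third list is empty, which corresponds to a consumer thread that is started but never receives a partition.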


Demo walkthrough


  1. Create a topic named "RRRR" with 2 partitions
  2. Create an ArtisanCosumerMock class and annotate its consuming method with @KafkaListener(concurrency=2)
  3. Start the unit test. Based on @KafkaListener(concurrency=2), Spring Kafka creates 2 Kafka consumers (two separate Kafka Consumer instances). Each consumer gets its own thread, in which it pulls and consumes messages
  4. Kafka then assigns the partitions of topic RRRR to the 2 consumers, 1 partition each (there are only 2 partitions, so this is the ideal case of one apiece)


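To sum up: @KafkaListener(concurrency=2) creates two Kafka consumers. Each one, on its own thread, pulls messages from its own partition of topic RRRR and consumes them serially, which gives multi-threaded concurrent consumption within a single process.

An aside:

RocketMQ's concurrent consumption needs only one RocketMQ Consumer object: after the consumer pulls messages, it hands them to the consumer's thread pool for processing, and that is where the concurrency comes from.

Spring-Kafka's concurrent consumption creates multiple Kafka Consumer objects, each with its own dedicated thread; after a consumer pulls messages, it processes them on that same thread.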
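The difference between the two threading models can be sketched in plain Java. This is only an illustration of the shapes described above: it uses no RocketMQ or Kafka API, in-memory lists stand in for pulled messages and partitions, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; no RocketMQ or Kafka API is involved. */
final class ThreadingModelSketch {

    /** Model 1: a single puller hands every message of a pulled batch to a shared worker pool. */
    static Set<String> pullThenDispatch(List<String> pulledBatch, int workerThreads) {
        Set<String> handled = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(workerThreads);
        for (String message : pulledBatch) {
            pool.execute(() -> handled.add(message)); // runs on whichever worker is free
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled;
    }

    /** Model 2: one thread per consumer; each thread walks through its own partition serially. */
    static Set<String> threadPerConsumer(List<List<String>> partitions) {
        Set<String> handled = ConcurrentHashMap.newKeySet();
        List<Thread> consumerThreads = new ArrayList<>();
        for (List<String> partition : partitions) {
            Thread consumer = new Thread(() -> {
                for (String message : partition) {
                    handled.add(message); // in partition order, on this consumer's own thread
                }
            });
            consumerThreads.add(consumer);
            consumer.start();
        }
        for (Thread consumer : consumerThreads) {
            try {
                consumer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return handled;
    }
}
```

In this sketch the second model keeps the messages of one partition on one thread, so they are handled in order, while the first model makes no such promise once messages reach the pool.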


Code


POM dependencies

	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>

		<!-- Spring-Kafka dependency -->
		<dependency>
			<groupId>org.springframework.kafka</groupId>
			<artifactId>spring-kafka</artifactId>
		</dependency>

		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
		</dependency>
		<dependency>
			<groupId>junit</groupId>
			<artifactId>junit</artifactId>
			<scope>test</scope>
		</dependency>
	</dependencies>


Configuration file

spring:
  # Kafka settings, bound to the KafkaProperties configuration class
  kafka:
    bootstrap-servers: 192.168.126.140:9092 # Kafka broker address(es); separate multiple entries with commas
    # Kafka producer settings
    producer:
      acks: 1 # 0: no acknowledgement. 1: the leader acknowledges. all: the leader and all in-sync replicas acknowledge.
      retries: 3 # number of retries when a send fails
      key-serializer: org.apache.kafka.common.serialization.StringSerializer # serializer for the message key
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer # serializer for the message value

    # Kafka consumer settings
    consumer:
      auto-offset-reset: earliest # start from the earliest offset when the consumer group has no committed offset
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
      properties:
        spring:
          json:
            trusted:
              packages: com.artisan.springkafka.domain
    # Kafka consumer listener settings
    listener:
      missing-topics-fatal: false # if a listened topic does not exist, startup can fail with an error; false avoids that

logging:
  level:
    org:
      springframework:
        kafka: ERROR # spring-kafka
      apache:
        kafka: ERROR # kafka

 


Producer

package com.artisan.springkafka.producer;

import com.artisan.springkafka.constants.TOPIC;
import com.artisan.springkafka.domain.MessageMock;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.stereotype.Component;

import java.util.Random;
import java.util.concurrent.ExecutionException;

/**
 * @author 小工匠
 * @version 1.0
 * @description: mock producer that sends test messages to Kafka
 * @date 2021/2/17 22:25
 * @mark: show me the code , change the world
 */

@Component
public class ArtisanProducerMock {

    @Autowired
    private KafkaTemplate<Object, Object> kafkaTemplate;

    /**
     * Sends one message synchronously.
     * @return the send result
     * @throws ExecutionException if the send fails
     * @throws InterruptedException if the thread is interrupted while waiting
     */
    public SendResult<Object, Object> sendMsgSync() throws ExecutionException, InterruptedException {
        // Build a mock message with a random id
        Integer id = new Random().nextInt(100);
        MessageMock messageMock = new MessageMock(id, "artisanTestMessage-" + id);
        // Block until the send completes
        return kafkaTemplate.send(TOPIC.TOPIC, messageMock).get();
    }

}
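
The producer refers to two classes that are not listed in this article, TOPIC and MessageMock. A minimal sketch of what they might look like follows. It is an assumption reconstructed from the imports, the topic name "RRRR" and the MessageMock{id=..., name='...'} text in the log output, not the author's actual source. In the real project each would be a public class in its own file, in the com.artisan.springkafka.constants and com.artisan.springkafka.domain packages.

```java
/** Assumed shape of the TOPIC constants class; the value comes from the topic name used in the demo. */
final class TOPIC {
    public static final String TOPIC = "RRRR";

    private TOPIC() {
    }
}

/** Assumed shape of the message payload; field names are inferred from the log output. */
class MessageMock {
    private Integer id;
    private String name;

    /** No-arg constructor so that JSON deserialization can instantiate the class. */
    public MessageMock() {
    }

    public MessageMock(Integer id, String name) {
        this.id = id;
        this.name = name;
    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "MessageMock{id=" + id + ", name='" + name + "'}";
    }
}
```

Because TOPIC.TOPIC is a compile-time constant, it can be used directly inside annotation attributes such as topics and groupId.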
    
     

Consumer

package com.artisan.springkafka.consumer;

import com.artisan.springkafka.domain.MessageMock;
import com.artisan.springkafka.constants.TOPIC;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

/**
 * @author 小工匠
 * @version 1.0
 * @description: mock consumer that listens with two concurrent consumers
 * @date 2021/2/17 22:33
 * @mark: show me the code , change the world
 */

@Component
public class ArtisanCosumerMock {

    private Logger logger = LoggerFactory.getLogger(getClass());
    private static final String CONSUMER_GROUP_PREFIX = "MOCK-A";

    // concurrency = "2": two consumers are created for this listener, each on its own thread
    @KafkaListener(topics = TOPIC.TOPIC, groupId = CONSUMER_GROUP_PREFIX + TOPIC.TOPIC,
        concurrency = "2")
    public void onMessage(MessageMock messageMock) {
        // The log text reads: "Received message [thread ID: {} message content: {}]"
        logger.info("【接受到消息][线程ID:{} 消息内容:{}]", Thread.currentThread().getId(), messageMock);
    }

}
    
    

The concurrency = "2" attribute on the @KafkaListener annotation creates 2 threads that consume the messages of topic "RRRR".


Unit test

 
package com.artisan.springkafka.produceTest;

import com.artisan.springkafka.SpringkafkaApplication;
import com.artisan.springkafka.producer.ArtisanProducerMock;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;

/**
 * @author 小工匠
 * @version 1.0
 * @description: sends a batch of messages and keeps the JVM alive so the listener can consume them
 * @date 2021/2/17 22:40
 * @mark: show me the code , change the world
 */

@RunWith(SpringRunner.class)
@SpringBootTest(classes = SpringkafkaApplication.class)
public class ProduceMockTest {

    private Logger logger = LoggerFactory.getLogger(getClass());

    @Autowired
    private ArtisanProducerMock artisanProducerMock;

    // Despite the method name, each message is sent synchronously
    @Test
    public void testAsynSend() throws ExecutionException, InterruptedException {
        logger.info("开始发送"); // "start sending"

        // Send several messages
        for (int i = 0; i < 10; i++) {
            artisanProducerMock.sendMsgSync();
        }

        // Block indefinitely so the consumers have time to run; stop the test manually
        new CountDownLatch(1).await();

    }

}
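
The test above waits on a latch that is never counted down, so it runs until it is stopped by hand. One possible alternative is to count down once per consumed message and wait with a timeout. The sketch below shows only that pattern in plain Java: an executor stands in for the listener threads, and the class and method names are hypothetical. In the real project the latch would have to be shared between the listener bean and the test, which is an assumption about how you would wire it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a bounded wait; an executor simulates the listener threads. */
final class BoundedWaitSketch {

    /** Returns true if all expected messages were "consumed" before the timeout. */
    static boolean sendAndAwait(int messages, int consumerThreads, long timeoutSeconds) {
        CountDownLatch consumed = new CountDownLatch(messages);
        ExecutorService consumers = Executors.newFixedThreadPool(consumerThreads);
        try {
            for (int i = 0; i < messages; i++) {
                // Stand-in for the listener method: one countDown per message handled
                consumers.execute(consumed::countDown);
            }
            // Wait until every message is handled, but never longer than the timeout
            return consumed.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            consumers.shutdownNow();
        }
    }
}
```

With this shape a test can assert on the returned boolean and finish on its own instead of blocking forever.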
    
    

Test results

2021-02-18 21:55:35.504  INFO 20456 --- [           main] c.a.s.produceTest.ProduceMockTest        : 开始发送
2021-02-18 21:55:35.852  INFO 20456 --- [ntainer#0-0-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:18 消息内容:MessageMock{id=23, name='artisanTestMessage-23'}]
2021-02-18 21:55:35.852  INFO 20456 --- [ntainer#0-1-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:20 消息内容:MessageMock{id=64, name='artisanTestMessage-64'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-1-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:20 消息内容:MessageMock{id=53, name='artisanTestMessage-53'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-0-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:18 消息内容:MessageMock{id=51, name='artisanTestMessage-51'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-1-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:20 消息内容:MessageMock{id=67, name='artisanTestMessage-67'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-0-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:18 消息内容:MessageMock{id=42, name='artisanTestMessage-42'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-0-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:18 消息内容:MessageMock{id=12, name='artisanTestMessage-12'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-1-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:20 消息内容:MessageMock{id=40, name='artisanTestMessage-40'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-1-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:20 消息内容:MessageMock{id=37, name='artisanTestMessage-37'}]
2021-02-18 21:55:35.859  INFO 20456 --- [ntainer#0-0-C-1] c.a.s.consumer.ArtisanCosumerMock        : 【接受到消息][线程ID:18 消息内容:MessageMock{id=27, name='artisanTestMessage-27'}]

The log shows two threads (IDs 18 and 20) consuming the messages of topic RRRR.

Check the console as well: [screenshot]

Next: [screenshot]

Log: [screenshot]

It is clear at a glance that only one thread is consuming.


Approach 2

[screenshots]

Retest

[screenshot]


@KafkaListener configuration options
   /**
     * @KafkaListener(groupId = "testGroup", topicPartitions = {
     * @TopicPartition(topic = "topic1", partitions = {"0", "1"}),
     * @TopicPartition(topic = "topic2", partitions = "0",
     * partitionOffsets = @PartitionOffset(partition = "1", initialOffset = "100"))
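     * },concurrency = "4")
     * // concurrency is the number of consumers in the same group, i.e. the degree of concurrent consumption; it must be less than or equal to the total number of partitions (four are listed here)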
     */
/**
 * Array of topics to listen to.
 * 
 * The topics for this listener.
 * The entries can be 'topic name', 'property-placeholder keys' or 'expressions'.
 * An expression must be resolved to the topic name.
 * This uses group management and Kafka will assign partitions to group members.
 * <p>
 * Mutually exclusive with {@link #topicPattern()} and {@link #topicPartitions()}.
 * @return the topic names or expressions (SpEL) to listen to.
 */
String[] topics() default {};
/**
 * Topic pattern to listen to.
 * 
 * The topic pattern for this listener. The entries can be 'topic pattern', a
 * 'property-placeholder key' or an 'expression'. The framework will create a
 * container that subscribes to all topics matching the specified pattern to get
 * dynamically assigned partitions. The pattern matching will be performed
 * periodically against topics existing at the time of check. An expression must
 * be resolved to the topic pattern (String or Pattern result types are supported).
 * This uses group management and Kafka will assign partitions to group members.
 * <p>
 * Mutually exclusive with {@link #topics()} and {@link #topicPartitions()}.
 * @return the topic pattern or expression (SpEL).
 * @see org.apache.kafka.clients.CommonClientConfigs#METADATA_MAX_AGE_CONFIG
 */
String topicPattern() default "";
/**
 * Array of @TopicPartition annotations. Each one can configure the topic, the partitions, and the starting offsets to listen to.
 * 
 * The topicPartitions for this listener when using manual topic/partition
 * assignment.
 * <p>
 * Mutually exclusive with {@link #topicPattern()} and {@link #topics()}.
 * @return the topic names or expressions (SpEL) to listen to.
 */
TopicPartition[] topicPartitions() default {};

/**
 * Consumer group.
 * Override the {@code group.id} property for the consumer factory with this value
 * for this listener only.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the group id.
 * @since 1.3
 */
String groupId() default "";

/**
 * Bean name of the KafkaListenerErrorHandler used to handle consumption errors.
 * 
 * Set an {@link org.springframework.kafka.listener.KafkaListenerErrorHandler} bean
 * name to invoke if the listener method throws an exception.
 * @return the error handler.
 * @since 1.3
 */
String errorHandler() default "";

/**
 * Custom concurrency for this listener, the attribute this article is about.
 * 
 * Override the container factory's {@code concurrency} setting for this listener. May
 * be a property placeholder or SpEL expression that evaluates to a {@link Number}, in
 * which case {@link Number#intValue()} is used to obtain the value.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the concurrency.
 * @since 2.2
 */
String concurrency() default "";

/**
 * Whether to start the listener automatically; by default it is started automatically.
 *  
 * Set to true or false, to override the default setting in the container factory. May
 * be a property placeholder or SpEL expression that evaluates to a {@link Boolean} or
 * a {@link String}, in which case the {@link Boolean#parseBoolean(String)} is used to
 * obtain the value.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return true to auto start, false to not auto start.
 * @since 2.2
 */
String autoStartup() default "";

/**
 * Additional Kafka consumer properties.
 * 
 * Kafka consumer properties; they will supersede any properties with the same name
 * defined in the consumer factory (if the consumer factory supports property overrides).
 * <h3>Supported Syntax</h3>
 * <p>The supported syntax for key-value pairs is the same as the
 * syntax defined for entries in a Java
 * {@linkplain java.util.Properties#load(java.io.Reader) properties file}:
 * <ul>
 * <li>{@code key=value}</li>
 * <li>{@code key:value}</li>
 * <li>{@code key value}</li>
 * </ul>
 * {@code group.id} and {@code client.id} are ignored.
 * @return the properties.
 * @since 2.2.4
 * @see org.apache.kafka.clients.consumer.ConsumerConfig
 * @see #groupId()
 * @see #clientIdPrefix()
 */
String[] properties() default {};





/**
 * Unique identifier.
 *  
 * The unique identifier of the container managing for this endpoint.
 * <p>If none is specified an auto-generated one is provided.
 * <p>Note: When provided, this value will override the group id property
 * in the consumer factory configuration, unless {@link #idIsGroup()}
 * is set to false.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the {@code id} for the container managing for this endpoint.
 * @see org.springframework.kafka.config.KafkaListenerEndpointRegistry#getListenerContainer(String)
 */
String id() default "";


/**
 * Prefix for the consumer client id.
 *  
 * When provided, overrides the client id property in the consumer factory
 * configuration. A suffix ('-n') is added for each container instance to ensure
 * uniqueness when concurrency is used.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the client id prefix.
 * @since 2.1.1
 */
String clientIdPrefix() default "";
/**
 * Whether to use id as the groupId when groupId is not set.
 * 
 * When {@link #groupId() groupId} is not provided, use the {@link #id() id} (if
 * provided) as the {@code group.id} property for the consumer. Set to false, to use
 * the {@code group.id} from the consumer factory.
 * @return false to disable.
 * @since 1.3
 */
boolean idIsGroup() default true;

/**
 * Bean name of the KafkaListenerContainerFactory to use.
 * If not set, the default KafkaListenerContainerFactory bean is used.
 * 
 * The bean name of the {@link org.springframework.kafka.config.KafkaListenerContainerFactory}
 * to use to create the message listener container responsible to serve this endpoint.
 * <p>If not specified, the default container factory is used, if any.
 * @return the container factory bean name.
 */
String containerFactory() default "";

/**
 * Name of the MessageListenerContainer group bean that this listener's container is added to.
 * 
 * If provided, the listener container for this listener will be added to a bean
 * with this value as its name, of type {@code Collection<MessageListenerContainer>}.
 * This allows, for example, iteration over the collection to start/stop a subset
 * of containers.
 * <p>SpEL {@code #{...}} and property place holders {@code ${...}} are supported.
 * @return the bean name for the group.
 */
String containerGroup() default "";

/**
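 * Pseudo bean name that refers, inside SpEL expressions, to the bean in which this listener is defined; the default name starts with "__".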
 * 
 * A pseudo bean name used in SpEL expressions within this annotation to reference
 * the current bean within which this listener is defined. This allows access to
 * properties and methods within the enclosing bean.
 * Default '__listener'.
 * <p>
 * Example: {@code topics = "#{__listener.topicList}"}.
 * @return the pseudo bean name.
 * @since 2.1.2
 */
String beanRef() default "__listener";

 
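concurrency in a distributed deployment

Leave the first unit test running and start the unit test again.

Keep starting more instances and you will find that once the number of nodes equals the number of partitions, each node effectively consumes with a single thread, which is the optimal state. The group then contains more consumers than there are partitions, so the extra consumer threads have nothing to consume.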


Source code

https://github.com/yangshangwei/boot2/tree/master/springkafkaConcurrencyConsume
