A note before we start:
I am "境里婆娑". I am still the same young man I used to be, without the slightest change; time is merely a test, and the conviction planted in my heart has not faded. The young man before you still wears the same face as at the start, and no hardship ahead will make him back down.
I write this blog to share and learn together with everyone. If you are interested in Java, feel free to follow me so we can learn together.
Preface: At work you may run into flat files in which a single record spans multiple lines, or files that mix several different record formats. There is no need to worry: Spring Batch already exposes the necessary extension interfaces, and a small amount of customization is enough to handle these cases.
Reading records that span multiple lines
When a flat file does not follow the standard one-record-per-line format, you can support it by implementing the record separator policy interface RecordSeparatorPolicy. Non-standard flat files come in several forms: a record may span multiple lines, or records may start or end with a specific character.
In the example below, every two lines form one record:
412222,201,tom,2020-02-27
,china
412453,203,tqm,2020-03-27
,us
412222,205,tym,2020-05-27
,jap
The default record separator policies, SimpleRecordSeparatorPolicy and DefaultRecordSeparatorPolicy, can no longer handle such a file, so we implement the RecordSeparatorPolicy interface ourselves with a custom policy, MulitiLineRecordSeparatorPolicy.
The class diagram of the core components used when reading multi-line records is shown below:
In this diagram, only MulitiLineRecordSeparatorPolicy and CommonFieldSetMapper are custom implementations; all the other components ship with Spring Batch.
MulitiLineRecordSeparatorPolicy: decides when a complete record has been read from the file. In this implementation, a record is considered complete once four comma delimiters have been read.
/**
 * Treats a record as complete once it contains the configured number of delimiters.
 *
 * @author shuliangzhao
 * @date 2020/12/6 13:05
 */
public class MulitiLineRecordSeparatorPolicy implements RecordSeparatorPolicy {

    private String delimiter = ",";
    private int count = 0;

    public int getCount() {
        return count;
    }

    public void setCount(int count) {
        this.count = count;
    }

    public String getDelimiter() {
        return delimiter;
    }

    public void setDelimiter(String delimiter) {
        this.delimiter = delimiter;
    }

    @Override
    public boolean isEndOfRecord(String record) {
        return countDelimiter(record) == count;
    }

    /** Counts the occurrences of the configured delimiter in the given record. */
    private int countDelimiter(String record) {
        String temp = record;
        int index;
        int count = 0;
        // Use the configured delimiter field rather than a hard-coded ","
        while ((index = temp.indexOf(delimiter)) != -1) {
            temp = temp.substring(index + 1);
            count++;
        }
        return count;
    }

    @Override
    public String postProcess(String record) {
        return record;
    }

    @Override
    public String preProcess(String record) {
        return record;
    }
}
delimiter: the delimiter character used while reading.
count: the total number of delimiters; a string containing exactly this many delimiters is considered a complete record.
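To see how these two properties interact, here is a minimal, Spring-free sketch of the same counting logic (the class name DelimiterCountDemo is made up for illustration). With count set to 4, the first physical line of a record fails the check, and only the concatenation of both lines passes:

```java
public class DelimiterCountDemo {

    /** Counts how many times delimiter occurs in record. */
    static int countDelimiter(String record, String delimiter) {
        int count = 0;
        int index = -1;
        while ((index = record.indexOf(delimiter, index + 1)) != -1) {
            count++;
        }
        return count;
    }

    /** Mirrors isEndOfRecord with count = 4 and delimiter = ",". */
    static boolean isEndOfRecord(String record) {
        return countDelimiter(record, ",") == 4;
    }

    public static void main(String[] args) {
        // First physical line alone: only 3 commas, so not a complete record yet
        System.out.println(isEndOfRecord("412222,201,tom,2020-02-27"));        // false
        // After the second line is appended, the comma count reaches 4
        System.out.println(isEndOfRecord("412222,201,tom,2020-02-27,china"));  // true
    }
}
```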
1. Job configuration for the multi-line file
The job for reading the multi-line file is configured with Java-based configuration as follows:
/**
 * Job configuration for reading records that span multiple lines.
 *
 * @author shuliangzhao
 * @date 2020/12/6 13:38
 */
@Configuration
@EnableBatchProcessing
public class MulitiLineConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private PartitonMultiFileProcessor partitonMultiFileProcessor;

    @Autowired
    private PartitionMultiFileWriter partitionMultiFileWriter;

    @Bean
    public Job mulitiLineJob() {
        return jobBuilderFactory.get("mulitiLineJob").start(mulitiLineStep()).build();
    }

    @Bean
    public Step mulitiLineStep() {
        return stepBuilderFactory.get("mulitiLineStep")
                .<CreditBill, CreditBill>chunk(12)
                .reader(mulitiLineRecordReader())
                .processor(partitonMultiFileProcessor)
                .writer(partitionMultiFileWriter)
                .build();
    }

    @Bean
    @StepScope
    public MulitiLineRecordReader mulitiLineRecordReader() {
        return new MulitiLineRecordReader(CreditBill.class);
    }
}
2. Reader for the multi-line file
MulitiLineRecordReader is shown below:
/**
 * @author shuliangzhao
 * @date 2020/12/6 13:09
 */
public class MulitiLineRecordReader extends FlatFileItemReader {

    public MulitiLineRecordReader(Class clz) {
        setResource(CommonUtil.createResource("D:\\aplus\\muliti\\muliti.csv"));
        String[] names = CommonUtil.names(clz);
        // Map each assembled line to the target type via the custom field set mapper
        DefaultLineMapper defaultLineMapper = new DefaultLineMapper();
        CommonFieldSetMapper commonFieldSetMapper = new CommonFieldSetMapper();
        commonFieldSetMapper.setTargetType(clz);
        defaultLineMapper.setFieldSetMapper(commonFieldSetMapper);
        DelimitedLineTokenizer delimitedLineTokenizer = new DelimitedLineTokenizer();
        delimitedLineTokenizer.setFieldSetFactory(new DefaultFieldSetFactory());
        delimitedLineTokenizer.setNames(names);
        delimitedLineTokenizer.setDelimiter(",");
        defaultLineMapper.setLineTokenizer(delimitedLineTokenizer);
        // A record is considered complete once it contains four comma delimiters
        MulitiLineRecordSeparatorPolicy mulitiLineRecordSeparatorPolicy = new MulitiLineRecordSeparatorPolicy();
        mulitiLineRecordSeparatorPolicy.setCount(4);
        mulitiLineRecordSeparatorPolicy.setDelimiter(",");
        setRecordSeparatorPolicy(mulitiLineRecordSeparatorPolicy);
        setLineMapper(defaultLineMapper);
    }
}
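Internally, FlatFileItemReader keeps appending physical lines to a buffer until the policy's isEndOfRecord returns true, and only then hands the assembled line to the LineMapper. A simplified, Spring-free sketch of that loop (the RecordAssemblyDemo class and assembleRecords helper are names invented here, and the delimiter is assumed to be a plain comma):

```java
import java.util.ArrayList;
import java.util.List;

public class RecordAssemblyDemo {

    /** Appends physical lines until the buffer contains the expected number of delimiters. */
    static List<String> assembleRecords(List<String> physicalLines, String delimiter, int expected) {
        List<String> records = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String line : physicalLines) {
            buffer.append(line);
            String candidate = buffer.toString();
            // Same test as MulitiLineRecordSeparatorPolicy.isEndOfRecord
            // (split treats "," literally here, since a comma is not a regex metacharacter)
            if (candidate.split(delimiter, -1).length - 1 == expected) {
                records.add(candidate);
                buffer.setLength(0);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "412222,201,tom,2020-02-27", ",china",
                "412453,203,tqm,2020-03-27", ",us");
        // Four physical lines collapse into two logical records
        System.out.println(assembleRecords(lines, ",", 4));
    }
}
```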
3. Custom FieldSetMapper
The custom CommonFieldSetMapper is shown below:
/**
 * @author shuliangzhao
 * @date 2020/12/4 22:14
 */
public class CommonFieldSetMapper<T> implements FieldSetMapper<T> {

    private Class<? extends T> type;

    @Override
    public T mapFieldSet(FieldSet fieldSet) throws BindException {
        try {
            T t = type.newInstance();
            Field[] declaredFields = type.getDeclaredFields();
            if (declaredFields != null) {
                for (Field field : declaredFields) {
                    field.setAccessible(true);
                    // The id field is not present in the file, so skip it
                    if (field.getName().equals("id")) {
                        continue;
                    }
                    String name = field.getType().getName();
                    if (name.equals("java.lang.Integer")) {
                        field.set(t, fieldSet.readInt(field.getName()));
                    } else if (name.equals("java.lang.String")) {
                        field.set(t, fieldSet.readString(field.getName()));
                    } else if (name.equals("java.util.Date")) {
                        field.set(t, fieldSet.readDate(field.getName()));
                    } else {
                        field.set(t, fieldSet.readString(field.getName()));
                    }
                }
                return t;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public void setTargetType(Class<? extends T> type) {
        this.type = type;
    }
}
4. Processor for the multi-line file
PartitonMultiFileProcessor is shown below:
@Component
@StepScope
public class PartitonMultiFileProcessor implements ItemProcessor<CreditBill, CreditBill> {

    @Override
    public CreditBill process(CreditBill item) throws Exception {
        CreditBill creditBill = new CreditBill();
        creditBill.setAcctid(item.getAcctid());
        creditBill.setAddress(item.getAddress());
        creditBill.setAmout(item.getAmout());
        creditBill.setDate(item.getDate());
        creditBill.setName(item.getName());
        return creditBill;
    }
}
5. Writer for the multi-line file
PartitionMultiFileWriter is shown below:
@Component
@StepScope
public class PartitionMultiFileWriter implements ItemWriter<CreditBill> {

    @Autowired
    private CreditBillMapper creditBillMapper;

    @Override
    public void write(List<? extends CreditBill> items) throws Exception {
        if (items != null && !items.isEmpty()) {
            items.forEach(creditBillMapper::insert);
        }
    }
}
With that, we have finished reading a file whose records span multiple lines.
For a more detailed look at all of the code above, head over to GitHub: full source for reading multi-line files