一、背景
在工作中,经常需要处理excel文件,将提供的excel文件导入数据库,有时候文件太大我们就需要做拆分了,下面实现将xlsx文件和xls文件按行数拆分,每个文件都保留标题行。这篇文件接着拆分csv大文件进行说明。
二、后端代码
1、我们需要引入poi的依赖
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi</artifactId> <version>4.0.1</version> </dependency> <!-- 07版本以后的格式 --> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>4.0.1</version> </dependency>
2、拆分xls类型的文件方法
(1)参数inputStream为要拆分的文件流
(2)参数fileName为拆分文件名
(3)参数splitSize为拆分后每个文件的最大行数
(4)思路:根据前端传的excel文件流转换为一个Workbook文件,创建Workbook文件目的就是能够更好的操作excel,然后获取总行数,根据传到splitSize计算出需要创建几个excel文件,然后创建需要的若干个空excel放入list集合中,遍历原始excel,遍历时有行,单元格的概念,这个poi都能够操作,原始文件第一行的所有单元格都遍历到每个拆分文件的第一行,其余行随机往拆分文件里面一行一行的写,写到最后,将拆分文件集合中的Workbook文件写入存放拆分文件的目录中,关流。
/** * 拆分xls文件并返回文件夹路径 * @param inputStream * @param fileName * @param splitSize * @return */ public static String getXlsZipPath(InputStream inputStream, String fileName, int splitSize) { StopWatch stopWatch = new StopWatch(); stopWatch.start(); Workbook workBook = null; try { workBook = WorkbookFactory.create(inputStream); } catch (Exception e) { logger.error("Load excel file error!", e); return null; } try { Sheet sheet = workBook.getSheetAt(0); Iterator<Row> rowIterator = sheet.rowIterator(); int totalRows = sheet.getPhysicalNumberOfRows(); int splitNumber = (int) ((totalRows % splitSize == 0) ? (totalRows / splitSize) : (totalRows / splitSize + 1)); logger.info("xls文件总行数: {}行 拆分文件个数:{}个", totalRows, splitNumber); List<HSSFWorkbook> hssfWorkbooks = new ArrayList<>(); //创建存放拆分文件的目录,文件夹存在,可能里面有内容,删除所有内容 File dir = new File(defaultDir + fileName); if (dir.exists()) { delAllFile(dir.getAbsolutePath()); } dir.mkdirs(); //创建的拆分文件写入流并放入集合中 for (int count = 0; count < splitNumber; count++) { HSSFWorkbook hssfWorkbook = new HSSFWorkbook(); hssfWorkbook.createSheet().createRow(0); hssfWorkbooks.add(hssfWorkbook); } int oldRow = 0; while (rowIterator.hasNext()) { Row row = rowIterator.next(); if (row.getRowNum() == 0) { oldRow++; Iterator<Cell> cellIterator = row.cellIterator(); int index = 0; while (cellIterator.hasNext()) { Cell next = cellIterator.next(); // 解决获取excel数据的方法与实际类型不符 next.setCellType(CellType.STRING); String value = next.getStringCellValue(); for (int count = 0; count < splitNumber; count++) { HSSFRow row1 = hssfWorkbooks.get(count).getSheetAt(0).getRow(0); row1.createCell(index).setCellValue(value); } index++; } } if (row.getRowNum() != 0) { Iterator<Cell> cellIterator = row.cellIterator(); HSSFWorkbook hssfWorkbook = hssfWorkbooks.get(oldRow % splitNumber); HSSFSheet sheet1 = hssfWorkbook.getSheetAt(0); int index = 0; HSSFRow row1 = sheet1.createRow(sheet1.getLastRowNum() + 1); while (cellIterator.hasNext()) { Cell next = cellIterator.next(); next.setCellType(CellType.STRING); String value = next.getStringCellValue(); row1.createCell(index++).setCellValue(value); } oldRow++; } } //创建拆分文件并关流 for (int i = 0; i < hssfWorkbooks.size(); i++) { String splitFilePath = defaultDir + fileName + File.separator + fileName + i + ".xlsx"; File file = new File(splitFilePath); file.createNewFile(); FileOutputStream outputStream = new FileOutputStream(file); hssfWorkbooks.get(i).write(outputStream); hssfWorkbooks.get(i).close(); } } catch (IOException e) { logger.error("拆分xls文件失败 :" + e); } stopWatch.stop(); logger.info("xls文件拆分共花费: " + stopWatch.getTotalTimeMillis() + " ms"); return defaultDir + fileName + File.separator; }
3、拆分xlsx类型的文件方法
(1)参数inputStream为要拆分的文件流
(2)参数fileName为拆分文件名
(3)参数splitSize为拆分后每个文件的最大行数
(4)思路:根据前端传的excel文件流转换为一个Workbook文件,创建Workbook文件目的就是能够更好的操作excel,然后获取总行数,根据传到splitSize计算出需要创建几个excel文件,然后创建需要的若干个空excel放入list集合中,遍历原始excel,遍历时有行,单元格的概念,这个poi都能够操作,原始文件第一行的所有单元格都遍历到每个拆分文件的第一行,其余行随机往拆分文件里面一行一行的写,写到最后,将拆分文件集合中的Workbook文件写入存放拆分文件的目录中,关流。
/** * 拆分xlsx文件并返回文件夹路径 * * @param inputStream * @param fileName * @param splitSize * @return */ public static String getXlsxZipPath(InputStream inputStream, String fileName, int splitSize) { StopWatch stopWatch = new StopWatch(); stopWatch.start(); Workbook workBook = null; try { workBook = WorkbookFactory.create(inputStream); } catch (Exception e) { logger.error("Load excel file error!", e); return null; } try { Sheet sheet = workBook.getSheetAt(0); Iterator<Row> rowIterator = sheet.rowIterator(); int totalRows = sheet.getPhysicalNumberOfRows(); int splitNumber = (int) ((totalRows % splitSize == 0) ? (totalRows / splitSize) : (totalRows / splitSize + 1)); logger.info("xlsx文件总行数: {}行 拆分文件个数:{}个", totalRows, splitNumber); List<XSSFWorkbook> xssfWorkbooks = new ArrayList<>(); //创建存放拆分文件的目录,文件夹存在,可能里面有内容,删除所有内容 File dir = new File(defaultDir + fileName); if (dir.exists()) { delAllFile(dir.getAbsolutePath()); } dir.mkdirs(); //创建的拆分文件写入流并放入集合中 for (int count = 0; count < splitNumber; count++) { XSSFWorkbook xssfWorkbook = new XSSFWorkbook(); xssfWorkbook.createSheet().createRow(0); xssfWorkbooks.add(xssfWorkbook); } int oldRow = 0; while (rowIterator.hasNext()) { Row row = rowIterator.next(); if (row.getRowNum() == 0) { oldRow++; Iterator<Cell> cellIterator = row.cellIterator(); int index = 0; while (cellIterator.hasNext()) { Cell next = cellIterator.next(); next.setCellType(CellType.STRING); String value = next.getStringCellValue(); for (int count = 0; count < splitNumber; count++) { XSSFRow row1 = xssfWorkbooks.get(count).getSheetAt(0).getRow(0); row1.createCell(index).setCellValue(value); } index++; } } if (row.getRowNum() != 0) { Iterator<Cell> cellIterator = row.cellIterator(); XSSFWorkbook xssfWorkbook = xssfWorkbooks.get(oldRow % splitNumber); XSSFSheet sheet1 = xssfWorkbook.getSheetAt(0); int index = 0; XSSFRow row1 = sheet1.createRow(sheet1.getLastRowNum() + 1); while (cellIterator.hasNext()) { Cell next = cellIterator.next(); next.setCellType(CellType.STRING); String value = next.getStringCellValue(); row1.createCell(index++).setCellValue(value); } oldRow++; } } //创建拆分文件并关流 for (int i = 0; i < xssfWorkbooks.size(); i++) { String splitFilePath = defaultDir + fileName + File.separator + fileName + i + ".xlsx"; File file = new File(splitFilePath); file.createNewFile(); FileOutputStream outputStream = new FileOutputStream(file); xssfWorkbooks.get(i).write(outputStream); xssfWorkbooks.get(i).close(); } } catch (IOException e) { logger.error("拆分xlsx文件失败 :" + e); } stopWatch.stop(); logger.info("xlsx文件拆分共花费: " + stopWatch.getTotalTimeMillis() + " ms"); return defaultDir + fileName + File.separator; }
4、这两种方法具体用哪一个需要在使用之前判断传过来的文件类型。
5、上面需要用到的删除文件方法如下
/*** * 删除文件夹 * */ public static void delFolder(String folderPath) { try { delAllFile(folderPath); // 删除完里面所有内容 String filePath = folderPath; filePath = filePath.toString(); File myFilePath = new File(filePath); myFilePath.delete(); // 删除空文件夹 } catch (Exception e) { e.printStackTrace(); } } /*** * 删除指定文件夹下所有文件 * * @param path 文件夹完整绝对路径 * @return */ public static boolean delAllFile(String path) { boolean flag = false; File file = new File(path); if (!file.exists()) { return flag; } if (!file.isDirectory()) { return flag; } String[] tempList = file.list(); File temp = null; for (int i = 0; i < tempList.length; i++) { if (path.endsWith(File.separator)) { temp = new File(path + tempList[i]); } else { temp = new File(path + File.separator + tempList[i]); } if (temp.isFile()) { temp.delete(); } if (temp.isDirectory()) { delAllFile(path + "/" + tempList[i]);// 先删除文件夹里面的文件 delFolder(path + "/" + tempList[i]);// 再删除空文件夹 flag = true; } } return flag; }