POI之SXSSFWorkbook大量数据导出至excel

一:简介
          SXSSFWorkbook是用来生成海量excel数据文件,主要原理是借助临时存储空间生成excel,
          SXSSFWorkbook专门处理大数据,对于大型excel的创建且不会内存溢出的,就只有SXSSFWorkbook了。
          它的原理很简单,用硬盘空间换内存(就像hashmap用空间换时间一样)。 SXSSFWorkbook是streaming
          版本的XSSFWorkbook,它只会保存最新的excel rows在内存里供查看,在此之前的excel rows都会被写入到
          硬盘里(Windows电脑的话,是写入到C盘根目录下的temp文件夹)。被写入到硬盘里的rows是不可见的/不
          可访问的。只有还保存在内存里的才可以被访问到。 
          注:HSSFWorkbook和XSSFWorkbook的Excel Sheet导出条数上限(<=2003版)是65535行、256列,(>=2007版)
               是1048576行,16384列,如果数据量超过了此上限,那么可以使用SXSSFWorkbook来导出。实际上上万条数据,
               甚至上千条数据就可以考虑使用SXSSFWorkbook了。
        注意:首先需要引入依赖:注意:4.0.0版本的JDK需要1.8以上,如果JDK是1.7的,那么就使用3.9版本的依赖

<!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml-schemas -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml-schemas</artifactId>
<version>4.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>4.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>4.0.0</version>
</dependency>
二:实例一,我们使用SXSSFWorkbook向Excel中写入50万条数据,只需要         34秒左右,内存占用率最多在700M左右,CPU使用率在25%左右           
        

           代码如下:

package com.test.POI;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.xssf.streaming.SXSSFRow;
import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class SXSSFWORKBookUtils {

@SuppressWarnings("resource")
public static void main(String[] args) throws FileNotFoundException, InvalidFormatException {
long startTime = System.currentTimeMillis();
String filePath = "E:\\txt\\111.xlsx";
SXSSFWorkbook sxssfWorkbook = null;
BufferedOutputStream outputStream = null;
try {
//这样表示SXSSFWorkbook只会保留100条数据在内存中,其它的数据都会写到磁盘里,这样的话占用的内存就会很少
sxssfWorkbook = new SXSSFWorkbook(getXSSFWorkbook(filePath),100);
//获取第一个Sheet页
SXSSFSheet sheet = sxssfWorkbook.getSheetAt(0);
for (int i = 0; i < 50; i++) {
for (int z = 0; z < 10000; z++) {
SXSSFRow row = sheet.createRow(i*10000+z);
for (int j = 0; j < 10; j++) {
row.createCell(j).setCellValue("你好:"+j);
}
}
}
outputStream = new BufferedOutputStream(new FileOutputStream(filePath));
sxssfWorkbook.write(outputStream);
outputStream.flush();
sxssfWorkbook.dispose();// 释放workbook所占用的所有windows资源
} catch (IOException e) {
e.printStackTrace();
}finally {
if(outputStream!=null) {
try {
outputStream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
long endTime = System.currentTimeMillis();
System.out.println(endTime-startTime);
}


/**
* 先创建一个XSSFWorkbook对象
* @param filePath
* @return
*/
public static XSSFWorkbook getXSSFWorkbook(String filePath) {
XSSFWorkbook workbook = null;
BufferedOutputStream outputStream = null;
try {
File fileXlsxPath = new File(filePath);
outputStream = new BufferedOutputStream(new FileOutputStream(fileXlsxPath));
workbook = new XSSFWorkbook();
workbook.createSheet("测试Sheet");
workbook.write(outputStream);
} catch (Exception e) {
e.printStackTrace();
}finally {
if(outputStream!=null) {
try {
outputStream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return workbook;
}

}
  效果:
      

三:我们使用XSSFWorkbook常规的方法分批向excel中写入50万条数据,内         存占用率最多在  2.1个G左右(占用了很大的内存),CPU使用率在90%           左右 ,最后内存 溢出了
          
          代码如下:
          

package com.test;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class POIController {
/**
* 这种方式效率比较低并且特别占用内存,数据量越大越明显
* @param args
* @throws FileNotFoundException
* @throws InvalidFormatException
*/
public static void main(String[] args) throws FileNotFoundException, InvalidFormatException {
long startTime = System.currentTimeMillis();
BufferedOutputStream outPutStream = null;
XSSFWorkbook workbook = null;
FileInputStream inputStream = null;
String filePath = "E:\\txt\\666.xlsx";
try {
workbook = getWorkBook(filePath);
XSSFSheet sheet = workbook.getSheetAt(0);
for (int i = 0; i < 50; i++) {
for (int z = 0; z < 10000; z++) {
XSSFRow row = sheet.createRow(i*10000+z);
for (int j = 0; j < 10; j++) {
row.createCell(j).setCellValue("你好:"+j);
}
}
//每次要获取新的文件流对象,避免将之前写入的数据覆盖掉
outPutStream = new BufferedOutputStream(new FileOutputStream(filePath));
workbook.write(outPutStream);
}
} catch (IOException e) {
e.printStackTrace();
}finally {
if(outPutStream!=null) {
try {
outPutStream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
if(inputStream!=null) {
try {
inputStream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
if(workbook!=null) {
try {
workbook.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
long endTime = System.currentTimeMillis();
System.out.println(endTime-startTime);
}
/**
* 先创建一个XSSFWorkbook对象
* @param filePath
* @return
*/
public static XSSFWorkbook getWorkBook(String filePath) {
XSSFWorkbook workbook = null;
try {
File fileXlsxPath = new File(filePath);
BufferedOutputStream outPutStream = new BufferedOutputStream(new FileOutputStream(fileXlsxPath));
workbook = new XSSFWorkbook();
workbook.createSheet("测试");
workbook.write(outPutStream);
} catch (Exception e) {
e.printStackTrace();
}
return workbook;
}

}
效果:
   
        
————————————————
版权声明:本文为CSDN博主「爱上口袋的天空」的原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接及本声明。
原文链接:https://blog.csdn.net/k_520_w/article/details/84404652

上一篇:利用浏览器打开HTML文件


下一篇:CLR寄宿和AppDomain