Hadoop File Decompression

Class

org.apache.hadoop.io.compress.CompressionCodecFactory

A factory that will find the correct codec for a given filename.

Method

CompressionCodec getCodec(Path file)

Find the relevant compression codec for the given file based on its filename suffix.

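In other words, it tells you which compression algorithm a compressed file uses. It returns null when no registered codec matches the suffix.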
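The suffix lookup can be illustrated with a small plain-Java sketch. This is a simplified stand-in, not Hadoop's actual implementation: the class `SuffixLookupSketch` and its methods are hypothetical, and the table lists only three of the codecs that ship with Hadoop together with their default extensions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-in for suffix-based codec lookup; not Hadoop's real code. */
final class SuffixLookupSketch {

    // Default extensions of a few codecs bundled with Hadoop.
    private static final Map<String, String> CODEC_BY_SUFFIX = new LinkedHashMap<>();
    static {
        CODEC_BY_SUFFIX.put(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        CODEC_BY_SUFFIX.put(".bz2", "org.apache.hadoop.io.compress.BZip2Codec");
        CODEC_BY_SUFFIX.put(".deflate", "org.apache.hadoop.io.compress.DefaultCodec");
    }

    /** Returns the codec class name for the file's suffix, or null if none matches. */
    static String codecFor(String filename) {
        for (Map.Entry<String, String> entry : CODEC_BY_SUFFIX.entrySet()) {
            if (filename.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    /** Strips the suffix if the name ends with it; otherwise returns the name unchanged. */
    static String removeSuffix(String filename, String suffix) {
        return filename.endsWith(suffix)
                ? filename.substring(0, filename.length() - suffix.length())
                : filename;
    }

    public static void main(String[] args) {
        String file = "/liguodong/data.gz";
        System.out.println(codecFor(file));                   // org.apache.hadoop.io.compress.GzipCodec
        System.out.println(removeSuffix(file, ".gz"));        // /liguodong/data
        System.out.println(codecFor("/liguodong/data.txt"));  // null
    }
}
```

The null result for `data.txt` is the case worth guarding against: calling a method on a missing codec would fail with a NullPointerException.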

package Compress;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.mapreduce.Job;

/**
 * Decompression
 * @author liguodong
 */
public class Decompression {

    static final String file = "/liguodong/data.gz";

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "DeCodec");
        // Identifies the jar that contains this class when running from a packaged jar
        job.setJarByClass(Decompression.class);

        CompressionCodecFactory codecFactory = new CompressionCodecFactory(conf);
        // Look up the codec that matches the file's suffix
        CompressionCodec codec = codecFactory.getCodec(new Path(file));
        if (codec == null) {
            throw new IOException("No compression codec found for " + file);
        }
        // Wrap the raw file in a stream that decompresses the data as it is read
        CompressionInputStream inputStream = codec.createInputStream(
                new FileInputStream(new File(file)));
        // Write the decompressed data to a file named without the extension
        FileOutputStream outputStream = new FileOutputStream(new File(
                CompressionCodecFactory.removeSuffix(file, codec.getDefaultExtension())));
        // This copyBytes overload closes both streams when it finishes
        IOUtils.copyBytes(inputStream, outputStream, conf);
    }
}

Package the class as a jar named Decodec.jar and run it:

[root@master liguodong]# yarn jar Decodec.jar
15/06/05 21:54:25 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
[root@master liguodong]# ll
total 524824
-rw-r--r-- 1 root root 1492 Jun 5 19:47 codec.jar
-rw-r--r-- 1 root root 536870912 Jun 5 21:54 data
-rw-r--r-- 1 root root 521844 Jun 5 21:40 data.gz
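To try the same flow without a Hadoop installation, the sketch below does a gzip round trip with the JDK only. It swaps Hadoop's codec classes for `java.util.zip.GZIPInputStream`, so it shows the idea rather than the Hadoop API. The class name `GzipRoundTripSketch`, the temporary directory, and the sample text are all assumptions made for illustration; it needs Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path; // java.nio Path, not org.apache.hadoop.fs.Path
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** JDK-only stand-in for the Hadoop example; file names are hypothetical. */
final class GzipRoundTripSketch {

    /** Writes text into a gzip file (plays the role of the existing data.gz). */
    static void compress(Path target, String text) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Decompresses source into a sibling file named without ".gz" and returns that file. */
    static Path decompress(Path source) throws IOException {
        String name = source.getFileName().toString();
        if (!name.endsWith(".gz")) {
            throw new IOException("Not a .gz file: " + source);
        }
        Path target = source.resolveSibling(
                name.substring(0, name.length() - ".gz".length()));
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source));
             OutputStream out = Files.newOutputStream(target)) {
            in.transferTo(out); // copies until end of stream (Java 9+)
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("decodec");
        Path gz = dir.resolve("data.gz");
        compress(gz, "hello hadoop\n");
        Path plain = decompress(gz);
        System.out.println(plain.getFileName() + " -> " + Files.readString(plain));
    }
}
```

As in the Hadoop version, the output file name is derived by dropping the extension, so `data.gz` becomes `data` in the same directory.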