I am using LZ4 to compress and decompress a string. I tried the following approach:
    import java.io.IOException;
    import java.util.zip.DataFormatException;
    import net.jpountz.lz4.LZ4Compressor;
    import net.jpountz.lz4.LZ4Factory;
    import net.jpountz.lz4.LZ4FastDecompressor;

    public class CompressionDemo {

        public static byte[] compressLZ4(LZ4Factory factory, String data) throws IOException {
            final int decompressedLength = data.getBytes().length;
            LZ4Compressor compressor = factory.fastCompressor();
            int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
            byte[] compressed = new byte[maxCompressedLength];
            compressor.compress(data.getBytes(), 0, decompressedLength, compressed, 0, maxCompressedLength);
            // Returns the whole worst-case-sized buffer.
            return compressed;
        }

        public static String deCompressLZ4(LZ4Factory factory, byte[] data) throws IOException {
            LZ4FastDecompressor decompressor = factory.fastDecompressor();
            byte[] restored = new byte[data.length];
            // Passes the length of the compressed array as the destination length.
            decompressor.decompress(data, 0, restored, 0, data.length);
            return new String(restored);
        }

        public static void main(String[] args) throws IOException, DataFormatException {
            String string = "kjshfhshfashfhsakjfhksjafhkjsafhkjashfkjhfjkfhhjdshfhhjdfhdsjkfhdshfdskjfhksjdfhskjdhfkjsdhfk";
            LZ4Factory factory = LZ4Factory.fastestInstance();
            byte[] arr = compressLZ4(factory, string);
            System.out.println(arr.length);
            System.out.println(deCompressLZ4(factory, arr) + "decom");
        }
    }
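It throws the following exception:
Exception in thread "main" net.jpountz.lz4.LZ4Exception: Error decoding offset 92 of input buffer
The problem is that decompression only works when I pass the actual byte[] length of the original String: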
    public static String deCompressLZ4(LZ4Factory factory, byte[] data) throws IOException {
        LZ4FastDecompressor decompressor = factory.fastDecompressor();
        byte[] restored = new byte[data.length];
        decompressor.decompress(data, 0, restored, 0, "kjshfhshfashfhsakjfhksjafhkjsafhkjashfkjhfjkfhhjdshfhhjdfhdsjkfhdshfdskjfhksjdfhskjdhfkjsdhfk".getBytes().length);
        return new String(restored);
    }
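It expects the actual byte[] size of the original string.
Can anyone help me with this?
Solution:
Because compression and decompression may happen on different machines, or the machine's default character encoding may not be one of the Unicode formats, the encoding should be specified explicitly as well.
As for the rest, the code below uses the actual compressed and decompressed lengths. It is best to store the size of the uncompressed data in plain form in front of the compressed bytes, so that it can be read back before decompressing.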
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;

    public static byte[] compressLZ4(LZ4Factory factory, String data) throws IOException {
        byte[] decompressed = data.getBytes(StandardCharsets.UTF_8);
        LZ4Compressor compressor = factory.fastCompressor();
        int maxCompressedLength = compressor.maxCompressedLength(decompressed.length);
        // Reserve 4 leading bytes for the uncompressed length.
        byte[] compressed = new byte[4 + maxCompressedLength];
        int compressedSize = compressor.compress(decompressed, 0, decompressed.length,
                compressed, 4, maxCompressedLength);
        ByteBuffer.wrap(compressed).putInt(decompressed.length);
        // Trim the buffer to the header plus the bytes actually written.
        return Arrays.copyOf(compressed, 4 + compressedSize);
    }

    public static String deCompressLZ4(LZ4Factory factory, byte[] data) throws IOException {
        LZ4FastDecompressor decompressor = factory.fastDecompressor();
        // Read the uncompressed length back from the 4-byte header.
        int decompressedLength = ByteBuffer.wrap(data).getInt();
        byte[] restored = new byte[decompressedLength];
        decompressor.decompress(data, 4, restored, 0, decompressedLength);
        return new String(restored, StandardCharsets.UTF_8);
    }
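The 4-byte length header used in the code above can be looked at in isolation. The following is a minimal sketch that only uses the JDK; the class name `LengthPrefixSketch` and its helper methods are hypothetical names introduced for illustration, and the "compressed" payload is just a placeholder byte array rather than real LZ4 output.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class LengthPrefixSketch {

    // Hypothetical helper: put the original (uncompressed) length as a
    // 4-byte big-endian int in front of an already-compressed payload.
    public static byte[] frame(int originalLength, byte[] payload) {
        ByteBuffer buffer = ByteBuffer.allocate(4 + payload.length);
        buffer.putInt(originalLength);
        buffer.put(payload);
        return buffer.array();
    }

    // Read the original length back from the first 4 bytes.
    public static int originalLength(byte[] framed) {
        return ByteBuffer.wrap(framed).getInt();
    }

    // Everything after the 4-byte header is the compressed payload.
    public static byte[] payload(byte[] framed) {
        return Arrays.copyOfRange(framed, 4, framed.length);
    }

    public static void main(String[] args) {
        byte[] placeholderPayload = {10, 20, 30}; // stands in for compressed bytes
        byte[] framed = frame(1000, placeholderPayload);
        System.out.println(framed.length);                    // 7
        System.out.println(originalLength(framed));           // 1000
        System.out.println(Arrays.toString(payload(framed))); // [10, 20, 30]
    }
}
```

With such a header the receiving side can size the destination buffer before calling the decompressor, which is what the fast decompressor needs.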
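It should be said that String is not suited for binary data, and that your compression/decompression methods are for text handling only. (A String contains Unicode text in the form of UTF-16 two-byte chars. Converting it to or from binary data always involves a conversion using some character encoding. That costs memory and speed, and it risks data corruption.)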
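The data corruption risk can be illustrated with a small sketch that only uses the JDK. The class name `StringRoundTripSketch` and the method `survivesStringRoundTrip` are hypothetical names chosen for this example; it checks whether a byte array is unchanged after being decoded into a String and encoded again as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringRoundTripSketch {

    // Decode the bytes as UTF-8 text, encode them again, and compare.
    public static boolean survivesStringRoundTrip(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(original, back);
    }

    public static void main(String[] args) {
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        // 0xFF and 0xFE are not valid UTF-8, so decoding replaces them.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
        System.out.println(survivesStringRoundTrip(text));   // true
        System.out.println(survivesStringRoundTrip(binary)); // false
    }
}
```

For arbitrary binary input it is therefore probably safer to keep the data as byte[] on both sides of the compression step and to involve String only when the content really is text.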