Overview:
Today I tried to read a zip file with C++'s ifstream and found that every run stopped after reading only 451 bytes (the test zip file is a little over 4 MB).
--------------------------------------------------
author: cs_cjl
website: http://blog.csdn.net/cs_cjl
--------------------------------------------------
Test code:
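#include <cstddef>
#include <fstream>
#include <iostream>
using namespace std;

int main (int argc, char *argv[])
{
    // Wide-character path: accepted by MSVC on Windows.
    // std::ios::binary selects binary mode; if the flag is omitted the
    // stream is opened in text mode, which is where the early stop
    // described below shows up.
    ifstream fs(L"d:/tic.zip", std::ios::binary);
    if (fs.is_open ()) {
        cout << "file is open" << endl;
    }
    if (fs.good ()) {
        cout << "filestream is good" << endl;
    }

    char buf[200];
    size_t total_size(0);
    while (true) {
        // Read up to 200 bytes; gcount() reports how many were actually read.
        fs.read(buf, 200);
        total_size += fs.gcount ();
        if (!fs) {
            // A short or failed read ends the loop; print the stream state.
            cout << "read " << fs.gcount () << endl;
            cout << "fs.good () : " << fs.good () << endl;
            cout << "fs.eof () : " << fs.eof () << endl;
            cout << "fs.fail () : " << fs.fail () << endl;
            break;
        }
    }
    cout << "read total size: " << total_size << endl;
    return 0;
}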
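Opening the zip file in a hex editor shows that byte 252 is 0x1a.
Looking this value up shows that it corresponds to Ctrl+Z. For historical reasons (Ctrl+Z served as an end-of-file marker in DOS text files),
reading in text mode on Windows ends as soon as this character is encountered.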
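The hex-editor check can also be done in code. The following is only a minimal sketch: the function name first_ctrl_z_position is made up for this illustration and is not part of the original test. It opens the file in binary mode, so the scan itself is not cut short, and reports the 1-based position of the first 0x1A byte.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not from the original post).
// Returns the 1-based position of the first 0x1A byte in the file,
// or 0 if the file cannot be opened or contains no such byte.
std::size_t first_ctrl_z_position(const std::string& path)
{
    // Binary mode, so that the scan does not stop at 0x1A itself.
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return 0;
    }
    std::size_t pos = 0;
    char c;
    while (in.get(c)) {
        ++pos;
        if (static_cast<unsigned char>(c) == 0x1A) {
            return pos;
        }
    }
    return 0;
}
```

Calling first_ctrl_z_position("d:/tic.zip") on the test file should point at the same byte that the hex editor shows.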
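Conclusion:
I had always assumed that the only difference between binary mode and text mode was how the line-ending characters \r and \n are handled.
This test shows that, besides handling line endings differently, text mode also gives special treatment to certain control characters.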
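A small experiment makes the difference visible without a 4 MB zip file. This is a sketch under stated assumptions: the helper names count_readable_bytes and write_sample_with_ctrl_z are invented for this illustration. It writes a 7-byte file containing a 0x1A byte, then counts how many bytes come back in each mode. In binary mode the count is always 7. In text mode on Windows the count would be expected to fall short because of the 0x1A byte, while on platforms whose text mode does no translation (Linux, for example) both counts are the same.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: counts how many bytes can be read from `path`
// when the file is opened with the given mode, using the same
// 200-byte read loop as the test code.
std::size_t count_readable_bytes(const std::string& path,
                                 std::ios::openmode mode)
{
    std::ifstream in(path, mode);
    std::size_t total = 0;
    char buf[200];
    while (in) {
        in.read(buf, sizeof buf);
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// Hypothetical helper: writes "abc", one 0x1A byte, then "def"
// (7 bytes in total) to `path` in binary mode.
void write_sample_with_ctrl_z(const std::string& path)
{
    std::ofstream out(path, std::ios::binary);
    // The literal is split so that "\x1a" is not parsed together with "def".
    out.write("abc\x1a" "def", 7);
}
```

After write_sample_with_ctrl_z("sample.bin"), compare count_readable_bytes("sample.bin", std::ios::in) with count_readable_bytes("sample.bin", std::ios::in | std::ios::binary) to see whether the platform's text mode stops early.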
References:
Wikipedia ASCII: http://en.wikipedia.org/wiki/Ascii
* Line reading chokes on 0x1A http://*.com/questions/405058/line-reading-chokes-on-0x1a/405169#405169