tellg() function gives the wrong file size?

2021-12-09 00:00:00 file c++ ifstream

I wrote a sample project that reads a file into a buffer. When I use the tellg() function, it gives me a larger value than the number of bytes the read function actually reads from the file. I think there is a bug.

Here is my code:

void read_file (const char* name, int *size , char*& buffer)
{
  ifstream file;

  file.open(name,ios::in|ios::binary);
  *size = 0;
  if (file.is_open())
  {
    // get length of file
    file.seekg(0,std::ios_base::end);
    int length = *size = file.tellg();
    file.seekg(0,std::ios_base::beg);

    // allocate buffer in size of file
    buffer = new char[length];

    // read
    file.read(buffer,length);
    cout << file.gcount() << endl;
   }
   file.close();
}

Main:

void main()
{
  int size = 0;
  char* buffer = NULL;
  read_file("File.txt",&size,buffer);

  for (int i = 0; i < size; i++)
    cout << buffer[i];
  cout << endl; 
}

Recommended answer

tellg does not report the size of the file, nor the offset from the beginning in bytes. It reports a token value which can later be used to seek to the same place, and nothing more. (It's not even guaranteed that you can convert the type to an integral type.)

At least, that is what the language specification says. In practice, on Unix systems, the value returned will be the offset in bytes from the beginning of the file, and under Windows, it will be the offset from the beginning of the file for files opened in binary mode. For Windows (and most non-Unix systems), in text mode, there is no direct and immediate mapping between what tellg returns and the number of bytes you must read to get to that position. Under Windows, all you can really count on is that the value will be no less than the number of bytes you have to read (and in most real cases, it won't be much greater, although it can be up to twice as large).

If it is important to know exactly how many bytes you can read, the only way of reliably doing so is by reading. You should be able to do this with something like:

#include <limits>

file.ignore( std::numeric_limits<std::streamsize>::max() );
std::streamsize length = file.gcount();
file.clear();   //  Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );
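The snippet above can be wrapped in a small function so that it is easy to try out. This is only a sketch: the names readable_size and example_usage are made up for illustration, and an in-memory std::istringstream stands in for the file so that the result does not depend on the platform.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper (not part of the original code): counts the bytes
// that can actually be read from the current position to the end of the
// stream, then rewinds to the beginning.
std::streamsize readable_size(std::istream& in)
{
    // Skip everything up to end of input; gcount() then holds the
    // number of characters that were skipped.
    in.ignore(std::numeric_limits<std::streamsize>::max());
    std::streamsize length = in.gcount();
    in.clear();                       // ignore() will have set eofbit
    in.seekg(0, std::ios_base::beg);  // go back to the start
    return length;
}

// Illustrative check against an in-memory stream.
void example_usage()
{
    std::istringstream in("hello\nworld");
    assert(readable_size(in) == 11);  // 5 + 1 + 5 characters
    assert(in.get() == 'h');          // the stream was rewound
}
```

With a std::ifstream opened in text mode, the count returned this way is the number of characters read will deliver, which is the number the buffer actually has to hold.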

Finally, two other remarks concerning your code:

First, the line:

*buffer = new char[length];

shouldn't compile: you have declared buffer to be a char*, so *buffer has type char, and is not a pointer. Given what you seem to be doing, you probably want to declare buffer as a char**. But a much better solution would be to declare it as a std::vector<char>& or a std::string&. (That way, you don't have to return the size as well, and you won't leak memory if there is an exception.)
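As a rough illustration of the std::vector<char>& suggestion, the function below fills a vector from any input stream. The name read_all and the use of std::istreambuf_iterator are choices made for this sketch, not something taken from the original code.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical replacement for the char*& / int* pair of parameters:
// the vector owns the memory and carries its own size.
void read_all(std::istream& in, std::vector<char>& buffer)
{
    // Copies every remaining character of the stream into the vector,
    // so no size has to be computed up front.
    buffer.assign(std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>());
}

// Illustrative check against an in-memory stream.
void example_usage()
{
    std::istringstream in("abc\ndef");
    std::vector<char> buffer;
    read_all(in, buffer);
    assert(buffer.size() == 7);
    assert(buffer.front() == 'a' && buffer.back() == 'f');
}
```

Because the vector releases its storage in its destructor, the caller no longer has to remember to delete[] the buffer.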

Second, the loop condition at the end is wrong. If you really want to read one character at a time,

while ( file.get( buffer[i] ) ) {
    ++ i;
}

should do the trick. A better solution would probably be to read blocks of data:

while ( file.read( buffer + i, N ) || file.gcount() != 0 ) {
    i += file.gcount();
}

Or even:

file.read( buffer, size );
size = file.gcount();
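To show the block-reading loop in a complete form, here is a sketch that collects the blocks into a std::vector<char>. The function name read_in_blocks and the block_size parameter are assumptions made for the example. The point it illustrates is that read fails on the last, partial block while gcount still reports how much of that block arrived.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: reads the rest of the stream block by block.
std::vector<char> read_in_blocks(std::istream& in, std::size_t block_size)
{
    std::vector<char> data;
    if (block_size == 0) {
        return data;  // a zero-sized block would never make progress
    }
    std::vector<char> block(block_size);
    // read() reports failure when it cannot fill the whole block, but
    // gcount() still says how many characters the partial block holds.
    while (in.read(block.data(), static_cast<std::streamsize>(block.size()))
           || in.gcount() != 0) {
        data.insert(data.end(), block.begin(), block.begin() + in.gcount());
    }
    return data;
}

// Illustrative check: 7 characters read in blocks of 3 (3 + 3 + 1).
void example_usage()
{
    std::istringstream in("abcdefg");
    std::vector<char> data = read_in_blocks(in, 3);
    assert(data.size() == 7);
    assert(data.back() == 'g');
}
```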

I just noticed a third error: if you fail to open the file, you don't tell the caller. At the very least, you should set the size to 0 (but some sort of more precise error handling is probably better).
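One possible way to report the failure, sketched under the assumption that a bool return value is precise enough for the caller: the signature below is hypothetical and simply combines the std::vector<char>& suggestion with an explicit success flag.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical variant of read_file: returns false if the file cannot
// be opened or a read error occurs, and leaves the contents in buffer.
bool read_file(const std::string& name, std::vector<char>& buffer)
{
    buffer.clear();
    std::ifstream file(name, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        return false;  // tell the caller instead of failing silently
    }
    buffer.assign(std::istreambuf_iterator<char>(file),
                  std::istreambuf_iterator<char>());
    return !file.bad();
}

// Illustrative check; assumes no file with this name exists.
void example_usage()
{
    std::vector<char> buffer;
    assert(!read_file("no-such-file.example", buffer));
    assert(buffer.empty());
}
```

A caller can then write if (!read_file("File.txt", buffer)) and handle the error there, rather than printing an empty buffer.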
