Chapter 27: Input/Output

Chapter 27 deals with iostreams and all their subcomponents and extensions. All kinds of fun stuff.


Copying a file

So you want to copy a file quickly and easily, and most important, completely portably. And since this is C++, you have an open ifstream (call it IN) and an open ofstream (call it OUT):

   #include <fstream>

   std::ifstream  IN ("input_file");
   std::ofstream  OUT ("output_file"); 

Here's the easiest way to get it completely wrong:

   OUT << IN;
For those of you who don't already know why this doesn't work (probably from having done it before), I invite you to quickly create a simple text file called "input_file" containing the sentence
   The quick brown fox jumped over the lazy dog.
surrounded by blank lines. Code it up and try it. The contents of "output_file" may surprise you.

Seriously, go do it. Get surprised, then come back. It's worth it.


The thing to remember is that the basic_[io]stream classes handle formatting, nothing else. In particular, formatted input breaks on whitespace. The actual reading, writing, and storing of data is handled by the basic_streambuf family. Fortunately, operator<< is overloaded to take an ostream and a pointer-to-streambuf, in order to help with just this kind of "dump the data verbatim" situation.

Why a pointer to streambuf and not just a streambuf? Well, the [io]streams hold pointers to their buffers, not the actual buffers. This allows polymorphic behavior on the part of the buffers as well as the streams themselves. The pointer is easily retrieved using the rdbuf() member function. Therefore, the easiest way to copy the file is:

   OUT << IN.rdbuf();
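
For readers who want the whole thing in one place, here is a minimal sketch of that one-liner wrapped in a function. The name copy_file, its parameters, and the choice of binary mode are my assumptions, not something the text above prescribes. The empty-file check is there because inserting a streambuf that yields no characters sets failbit on the output stream.

```cpp
#include <fstream>
#include <string>

// Copy the file named `from` to the file named `to`, character for character.
// Returns true when both files opened and the copy succeeded.
// copy_file and its parameter names are illustrative, not from the text above.
bool copy_file(const std::string& from, const std::string& to)
{
    // Assumption: binary mode, so that no line-ending translation happens
    // on systems that distinguish text files from binary files.
    std::ifstream in(from.c_str(), std::ios::in | std::ios::binary);
    std::ofstream out(to.c_str(), std::ios::out | std::ios::binary);
    if (!in || !out)
        return false;

    // Inserting a streambuf that yields no characters sets failbit on the
    // output stream, so an empty source file is handled separately.
    if (in.peek() == std::ifstream::traits_type::eof())
        return true;

    out << in.rdbuf();   // the whole copy
    return out.good();
}
```

A call such as copy_file("input_file", "output_file") should leave the blank lines around the fox sentence intact.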

So what was happening with OUT<<IN? Nothing reliable, since the Standard does not define that <<. I have seen instances where it is implemented, but the character extraction process removes all the whitespace, leaving you with no blank lines and only "Thequickbrownfox...". With libraries that do not define that operator, IN (or one of IN's member pointers) sometimes gets converted to a void*, and the output file then contains a hexadecimal address (quite a big surprise). Others don't compile at all.

Also note that none of this is specific to o*f*streams. The operators shown above are all defined in the parent basic_ostream class and are therefore available with all possible descendants.
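
To see the same behavior without touching the filesystem, here is a small sketch using string streams. The function names dump_verbatim and dump_formatted are made up for illustration. The first uses the pointer-to-streambuf overload; the second uses ordinary formatted extraction, which is roughly what the whitespace-eating implementations mentioned above end up doing.

```cpp
#include <sstream>
#include <string>

// Route `text` through the streambuf overload of operator<<.
// Every character, including whitespace, comes out the other side.
std::string dump_verbatim(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    out << in.rdbuf();
    return out.str();
}

// For contrast: formatted extraction skips whitespace between words,
// so the blank lines and the spaces are lost.
std::string dump_formatted(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    std::string word;
    while (in >> word)
        out << word;
    return out.str();
}
```

The same out << in.rdbuf() line works unchanged if out is std::cout or any other ostream.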



The buffering is screwing up my program!

First, are you sure that you understand buffering? Particularly the fact that C++ may not, in fact, have anything to do with it?

The rules for buffering can be a little odd, but they aren't any different from those of C. (Maybe that's why they can be a bit odd.) Many people think that writing a newline to an output stream automatically flushes the output buffer. This is true only when the output stream is, in fact, a terminal and not a file or some other device -- and that may not even be true since C++ says nothing about files or terminals. All of that is system-dependent. (The newline-buffer-flushing thing is mostly true on Unix systems, though.)

Some people also believe that sending endl down an output stream only writes a newline. This is incorrect; after a newline is written, the buffer is also flushed. Perhaps this is the effect you want when writing to a screen -- get the text out as soon as possible, etc -- but the buffering is largely wasted when doing this to a file:

   output << "a line of text" << endl;
   output << some_data_variable << endl;
   output << "another line of text" << endl; 
The proper thing to do in this case is to just write the data out and let the libraries and the system worry about the buffering. If you need a newline, just write a newline:
   output << "a line of text\n"
          << some_data_variable << '\n'
          << "another line of text\n"; 
I have also joined the output statements into a single statement. You could make the code prettier by moving the single newline to the start of the quoted text on the third line, for example.
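
If you would rather see the difference than take my word for it, the following sketch counts flush requests. The class counting_buf and the two functions are hypothetical helpers written for this illustration; the buffer simply records how often the stream asks it to synchronize.

```cpp
#include <ostream>
#include <sstream>

// A stringbuf that counts how often it is asked to flush.
// counting_buf is a hypothetical helper, not a library class.
class counting_buf : public std::stringbuf
{
public:
    counting_buf() : flushes(0) {}
    int flushes;

protected:
    // Called through pubsync() whenever the owning stream is flushed.
    virtual int sync()
    {
        ++flushes;
        return 0;
    }
};

// Three lines ended with endl: each endl writes a newline, then flushes.
int flushes_with_endl()
{
    counting_buf buf;
    std::ostream output(&buf);
    output << "a line of text" << std::endl;
    output << 42 << std::endl;
    output << "another line of text" << std::endl;
    return buf.flushes;
}

// The same three lines ended with plain newlines: no flush is requested.
int flushes_with_newline()
{
    counting_buf buf;
    std::ostream output(&buf);
    output << "a line of text\n"
           << 42 << '\n'
           << "another line of text\n";
    return buf.flushes;
}
```

Here flushes_with_endl() returns 3 and flushes_with_newline() returns 0, even though both write exactly the same characters.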

If you do need to flush the buffer above, you can send an endl if you also need a newline, or just flush the buffer yourself:

   output << ...... << flush;    // can use std::flush manipulator
   output.flush();               // or call a member fn 
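
As a rough illustration of what an explicit flush buys you, this sketch writes some text without a newline, flushes, and then reads the file back through a second stream while the writer is still open. The helper names read_back and visible_after_flush are my own. What a reader would see before the flush depends on the library and the system, so the sketch does not try to show that.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Return the current contents of the named file, as a second reader sees them.
std::string read_back(const char* path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Write `text`, flush it, and report what is visible in the file
// while the writing stream is still open.
std::string visible_after_flush(const char* path, const std::string& text)
{
    std::ofstream output(path);
    output << text;            // may still be sitting in the library's buffer
    output.flush();            // same effect as: output << std::flush;
    return read_back(path);    // the writer is still open at this point
}
```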

On the other hand, there are times when writing to a file should be like writing to standard error; no buffering should be done because the data needs to appear quickly (a prime example is a log file for security-related information). The way to do this is just to turn off the buffering before any I/O operations at all have been done, i.e., as soon as possible after opening:

   std::ofstream    os ("/foo/bar/baz");
   std::ifstream    is ("/qux/quux/quuux");
   int   i;

   os.rdbuf()->pubsetbuf(0,0);
   is.rdbuf()->pubsetbuf(0,0);
   ...
   os << "this data is written immediately\n";
   is >> i;   // and this will probably cause a disk read 

Since all aspects of buffering are handled by a streambuf-derived member, it is necessary to get at that member with rdbuf(). Then the public version of setbuf can be called. The arguments are the same as those for the Standard C I/O Library function (a buffer area followed by its size).
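
Here is a hedged sketch of the log-file case. The helper name log_unbuffered is made up. Note one deliberate difference from the snippet above: the request is made on the still-closed stream and the file is opened afterwards. That ordering is my assumption, chosen because at least some libraries count open() as the first I/O operation and ignore a later request; other libraries may want the opposite order, so check yours.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Write one line to the file at `path` through an ofstream that has been
// asked not to buffer, and return what a second reader sees while the
// writer is still open and has not been flushed.
// log_unbuffered is a hypothetical helper name.
std::string log_unbuffered(const char* path, const std::string& line)
{
    std::ofstream os;

    // Assumption: the request is made before open(), because some libraries
    // treat opening as I/O and ignore the request afterwards.
    os.rdbuf()->pubsetbuf(0, 0);
    os.open(path);

    os << line << '\n';      // no endl, no flush

    // On a library that honored the request, the line is already visible
    // here; on one that did not, this may well read back nothing.
    std::ifstream check(path);
    return std::string(std::istreambuf_iterator<char>(check),
                       std::istreambuf_iterator<char>());
}
```

Either way the line reaches the file once os is destroyed; the only question is how early.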

A great deal of this is implementation-dependent. For example, streambuf does not specify any actions for its own setbuf()-ish functions; the classes derived from streambuf each define behavior that "makes sense" for that class: an argument of (0,0) turns off buffering for filebuf but has undefined behavior for its sibling stringbuf, and specifying anything other than (0,0) has varying effects. Other user-defined classes derived from streambuf can do whatever they want.

A last reminder: there are usually more buffers involved than just those at the language/library level. Kernel buffers, disk buffers, and the like will also have an effect. Inspecting and changing those is system-dependent.



Binary I/O

This will be finished as soon as I finish changing jobs...


Comments and suggestions are welcome, and may be sent to Phil Edwards or Gabriel Dos Reis.