26.Jan.2014

■ Disk access: to buffer or not to buffer?


The other day I reported on an experiment of mine which suggested that buffer size is not a very important factor in file reading speed, as long as the buffer is at least 4096 bytes long. The corollary was "when you don't need to read the whole file, read it in small chunks". This heuristic makes sense for a file searching program, where the keyword sought may be right at the beginning of the file.

Some people rightly pointed out that this experiment was fatally flawed because it relied on repeated reading of the same file. When you read the same file 1000 times, the ReadFile Windows API only touches the disk the very first time; Windows keeps the file in its memory cache, so from the second read onwards the data come straight out of RAM at much higher speeds. A better designed experiment is called for, one that takes the memory cache out of the picture.

In the revised experiment a single large file (~150MB) was read just once, with various buffer sizes, in a simple loop like this:


#include <windows.h>
#include <stdio.h>
#include <time.h>

#define BUF_SIZE 4096

	// open for plain sequential reading; the flag hints the cache
	// manager to prefetch data ahead of the reads
	HANDLE hf = CreateFile("big.iso", GENERIC_READ, FILE_SHARE_READ, 0,
		OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
	BYTE buf[BUF_SIZE];
	DWORD actual;
	clock_t now = clock();
	while(1) {
		ReadFile(hf, buf, BUF_SIZE, &actual, 0);
		if(!actual)	// zero bytes read means end of file
			break;
	}
	now = clock() - now;	// elapsed ticks for the whole read
	CloseHandle(hf);
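
To turn the raw clock() ticks into an actual throughput figure, the loop can be extended with a byte counter. A minimal sketch of the bookkeeping (the totalRead counter and the printf report are illustrative additions, not part of the original test):

	DWORD totalRead = 0;
	clock_t t0 = clock();
	while(ReadFile(hf, buf, BUF_SIZE, &actual, 0) && actual)
		totalRead += actual;	// accumulate the bytes actually read
	double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
	printf("%lu bytes in %.2f sec = %.1f MB/sec\n",
		(unsigned long)totalRead, secs, totalRead / (1048576.0 * secs));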

The buffer size parameter BUF_SIZE was varied and the timing figures recorded. Of course the big problem is taking the Windows cache out of the picture. How can this be done? A simple solution is to reboot the PC and run the timing code on "fresh" RAM, but that would be tedious and take ages. As I am not getting any younger, I opted for a virtual machine: restart the VM each time and measure the reading speed there. Rebooting a virtual machine is much quicker, but of course it also means the hard disk is virtual (that is, not real). So I don't expect a place in the Royal Academy of Engineering, but I believe a point is being made.

The timing data for this non-cached disk read experiment are surprising. The buffer size is completely irrelevant, even for very small 512 byte I/O buffers. It took approximately 5 seconds (averaged over repeated experiments) to read 149,303,296 bytes straight off the disk, which works out to roughly 30 MB/sec. A second read (served from the Windows cache) proved 5 times faster (this is the speed reported in the previous experiment), and it too was independent of the buffer size used.

I have since learned that FILE_FLAG_SEQUENTIAL_SCAN is probably behind this behavior: the flag hints the cache manager to read ahead, so while one file chunk is being worked upon, the next chunk is already being prefetched and is available immediately.
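
An easy way to test this hypothesis would be to repeat the run with the prefetch hint removed; only the CreateFile call changes (a sketch, not something I have measured):

	// same experiment, minus the prefetch hint
	HANDLE hf = CreateFile("big.iso", GENERIC_READ, FILE_SHARE_READ, 0,
		OPEN_EXISTING, 0 /*no FILE_FLAG_SEQUENTIAL_SCAN*/, NULL);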

Assuming the virtual hard disk did not distort the experiment completely, the only reasonable explanation is that Windows optimizes disk access behind the scenes using some fixed buffer size of its own, so the size passed to ReadFile is irrelevant.

The importance of this Windows disk caching becomes apparent when one uses the CreateFile flag FILE_FLAG_NO_BUFFERING to bypass it. The I/O buffer size now becomes very important, as you can see from the graph below. For 512 byte buffers it took 55 seconds (!) to read the same file, and although the speed improved with increasing buffer sizes, it never quite reached the 5 second average of the Windows-optimized disk read.
[graph: effect of I/O buffer size on raw reading speed]
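
For anyone who wants to reproduce this, unbuffered reads come with strict alignment rules: with FILE_FLAG_NO_BUFFERING the buffer address, the number of bytes requested and the file offset must all be multiples of the volume sector size. A minimal sketch, reusing BUF_SIZE from above (4096 bytes is a multiple of every common sector size) and VirtualAlloc because it returns page-aligned memory:

	HANDLE hf = CreateFile("big.iso", GENERIC_READ, FILE_SHARE_READ, 0,
		OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
	// page-aligned memory satisfies the sector alignment requirement
	BYTE *buf = (BYTE*)VirtualAlloc(NULL, BUF_SIZE, MEM_COMMIT, PAGE_READWRITE);
	DWORD actual;
	while(ReadFile(hf, buf, BUF_SIZE, &actual, 0) && actual)
		;	// each chunk lands in buf straight off the disk
	VirtualFree(buf, 0, MEM_RELEASE);
	CloseHandle(hf);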

The great hunt for the optimal I/O buffer size has reached a rather anticlimactic end. Windows does such a good job with disk access that you can use any buffer size (even an idiotic one) in your program and it won't matter. And under no circumstances should you use FILE_FLAG_NO_BUFFERING, or performance will take a hit.
