date 23.Jun.2013

■ Optimal buffer size for the Windows ReadFile API


Everybody knows that reading and writing data from/to file storage (e.g. a hard disk) is much slower than accessing RAM. That's why higher-level file management through C++ streams or the C fprintf() function uses memory buffering behind the scenes, so as to minimize disk access and improve I/O performance. What criteria these buffered functions use, and how they adapt to disk cluster sizes, is anybody's guess.
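You can even see that buffering in action: the C runtime lets you pick the buffer it uses through setvbuf(). Here is a minimal sketch; the 64KB buffer size is just an illustration, not a recommendation:

  #include <stdio.h>

  int main()
  {
      FILE* f = fopen("pic.jpg", "rb");
      if (!f) return 1;

      // replace the default CRT buffer (typically a few KB) with a 64KB one;
      // fread() then refills this buffer in big gulps and rarely hits the disk
      static char crtBuffer[65536];
      setvbuf(f, crtBuffer, _IOFBF, sizeof(crtBuffer));

      char byte;
      while (fread(&byte, 1, 1, f) == 1)
          ;  // even byte-at-a-time reads are served from the big buffer

      fclose(f);
      return 0;
  }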

In Windows programming, for maximum efficiency folks use the native file API like ReadFile, where there is no fixed buffer size. The question is: what is the best buffer size for reading a file sequentially from beginning to end? This calls for a controlled experiment. I used a JPG file 1.13 MB in size and read it in with various buffer sizes. Reading a 1MB file nowadays takes no time at all, so to obtain meaningful results I read the file many times over. The file was opened like this:

  HANDLE hFile = CreateFile("pic.jpg", GENERIC_READ, FILE_SHARE_READ, NULL,
                            OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
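
For reference, here is a minimal sketch of the kind of timing loop involved. This is my reconstruction, not the exact test code; the TimeReads() name, the error handling and the loop bounds are mine:

  #include <windows.h>
  #include <stdio.h>

  // time how long it takes to read the whole file 'reps' times,
  // using a read buffer of 'bufSize' bytes
  DWORD TimeReads(LPCSTR path, DWORD bufSize, int reps)
  {
      char* buffer = new char[bufSize];
      DWORD start = GetTickCount();
      for (int i = 0; i < reps; i++) {
          HANDLE hFile = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ,
              NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
          if (hFile == INVALID_HANDLE_VALUE)
              break;
          DWORD bytesRead;
          // ReadFile returns TRUE with bytesRead == 0 at end of file
          while (ReadFile(hFile, buffer, bufSize, &bytesRead, NULL) && bytesRead > 0)
              ;
          CloseHandle(hFile);
      }
      DWORD elapsed = GetTickCount() - start;
      delete[] buffer;
      return elapsed;
  }

  int main()
  {
      for (DWORD bufSize = 16; bufSize <= 1048576; bufSize *= 2)
          printf("%7lu bytes: %lu ms\n", bufSize, TimeReads("pic.jpg", bufSize, 10));
      return 0;
  }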

Buffer (bytes) | NTFS (ms) | NTFS (ms) | FAT (ms) | FAT (ms)  | NO_BUFFERING NTFS (ms)
               | 10 reps   | 1000 reps | 10 reps  | 1000 reps | 10 reps
---------------+-----------+-----------+----------+-----------+-----------------------
            16 |      5242 |           |     1778 |           |
            32 |      2621 |           |      889 |           |
            64 |      1310 |           |      453 |           |
           128 |       640 |           |      218 |           |
           256 |       327 |           |      125 |           |
           512 |       172 |           |       62 |           |
          1024 |        78 |           |       47 |           |
          2048 |        47 |      1966 |       16 |      2262 |                   1622
          4096 |        31 |      1107 |       15 |      1514 |                    936
          8192 |        15 |       702 |       16 |      1170 |                    499
         16384 |         0 |       515 |        0 |      1060 |                    421
         32768 |         0 |       390 |       15 |       936 |                    281
         65536 |        16 |       343 |       16 |       858 |                    234
        131072 |         0 |       312 |        0 |       843 |                    203
        262144 |         0 |       281 |       16 |       780 |                    218
        524288 |         0 |       297 |        0 |       811 |                    203
       1048576 |        16 |       312 |       15 |       920 |                    234

Table 1. Time (ms) to read a 1MB file repeatedly, depending on the buffer size used

The timings in table 1 depend on the disk format: the first two timing columns are for an NTFS disk (4096-byte clusters) and the next two for a FAT-formatted USB stick (8192-byte clusters). Obviously the hardware is different, so there's not much point comparing NTFS against FAT here. What is clear is that reading the picture with a small buffer, say 16 bytes at a time, is very slow, despite the internal Windows buffering. Increasing the buffer size from 16 to 2048 bytes speeds up reading more than 100-fold!
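If you want to check the cluster size of one of your own volumes, the GetDiskFreeSpace API reports sectors per cluster and bytes per sector (the C:\ root path below is just an example):

  #include <windows.h>
  #include <stdio.h>

  int main()
  {
      DWORD sectorsPerCluster, bytesPerSector, freeClusters, totalClusters;
      // cluster size = sectors per cluster x bytes per sector
      if (GetDiskFreeSpaceA("C:\\", &sectorsPerCluster, &bytesPerSector,
                            &freeClusters, &totalClusters))
          printf("cluster size: %lu bytes\n", sectorsPerCluster * bytesPerSector);
      return 0;
  }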

Once the buffer reaches the disk cluster size (4096 bytes and above), reading is near-instantaneous. Reading the file 10 times takes no measurable time, so to get usable numbers I read the file 1000 times (see the 3rd and 5th columns in table 1). There are still some differences up to a 32KB buffer, but from then on the speed stays the same, within the accuracy offered by GetTickCount.

What do we infer from these timings? If you are reading a file sequentially, the bigger the buffer the merrier. And do not use the FILE_FLAG_NO_BUFFERING flag, which disables the internal Windows buffering: as the last column shows, the performance is dreadful, about 100 times slower than buffered I/O (10 unbuffered reads of the file take as long as 1000 buffered reads).
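Incidentally, FILE_FLAG_NO_BUFFERING is also a hassle to use correctly: the read size and file offset must be multiples of the volume's sector size, and the buffer itself must be sector-aligned. A sketch of what it takes (the 64KB size assumes the usual 512-byte or 4KB sectors; VirtualAlloc returns page-aligned memory, which satisfies the alignment rule):

  HANDLE hFile = CreateFileA("pic.jpg", GENERIC_READ, FILE_SHARE_READ, NULL,
      OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
  // sector-aligned buffer; a plain new[] or malloc() buffer would likely fail
  char* buffer = (char*)VirtualAlloc(NULL, 65536, MEM_COMMIT | MEM_RESERVE,
      PAGE_READWRITE);
  DWORD bytesRead;
  ReadFile(hFile, buffer, 65536, &bytesRead, NULL);  // 64KB: a sector multiple
  VirtualFree(buffer, 0, MEM_RELEASE);
  CloseHandle(hFile);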

Here's another twist: say you are scanning a file to find some keyword in it. The keyword may happen to be at the beginning of the file or at the very end, or it may be absent altogether. What's the best reading strategy when we may not need to read the whole file? The maximum sensible chunk size here is 32KB: reading is just as fast as when slurping the whole file, yet if we hit the keyword in the first chunk we save ourselves all the time it would take to read the remaining bytes. In fact, as disk access is orders of magnitude slower than memory access, I recommend reading just 8KB at a time; whatever we lose in raw disk throughput we win back through the chance of finding the keyword early in the file. This opportunistic strategy is used in xplorer².
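Here is a minimal sketch of such an opportunistic scan. This is my illustration, not the xplorer² source; the FileContains() name and the naive memcmp() search are assumptions. Note the small overlap carried between chunks so a keyword straddling a chunk boundary is not missed:

  #include <windows.h>
  #include <string.h>

  // scan a file for 'keyword', reading 'chunkSize' bytes at a time and
  // stopping as soon as a match is found; the early exit saves disk reads
  bool FileContains(LPCSTR path, const char* keyword, DWORD chunkSize)
  {
      const DWORD keyLen = (DWORD)strlen(keyword);
      if (keyLen == 0)
          return false;
      HANDLE hFile = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
          OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
      if (hFile == INVALID_HANDLE_VALUE)
          return false;

      // keep keyLen-1 bytes of overlap between chunks so a match that
      // straddles two chunks is still found
      char* buffer = new char[chunkSize + keyLen];
      DWORD carried = 0, bytesRead = 0;
      bool found = false;

      while (!found &&
             ReadFile(hFile, buffer + carried, chunkSize, &bytesRead, NULL) &&
             bytesRead > 0) {
          const DWORD total = carried + bytesRead;
          for (DWORD i = 0; !found && i + keyLen <= total; i++)
              found = memcmp(buffer + i, keyword, keyLen) == 0;
          carried = keyLen - 1;                  // tail that may start a match
          if (carried > total) carried = total;
          memmove(buffer, buffer + total - carried, carried);
      }
      delete[] buffer;
      CloseHandle(hFile);
      return found;
  }

A call like FileContains("haystack.txt", "needle", 8192) stops reading the moment the keyword turns up, potentially long before the end of a big file.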

And now you know why some file managers are better than others — and that's scientifically proven <g>


