[Hdf-forum] Chunk cache size and performance

Rob Latham robl at mcs.anl.gov
Fri Jan 8 10:10:01 EST 2010


On Thu, Jan 07, 2010 at 08:30:57PM +0100, Francesc Alted wrote:

> What I want to stress during the workshop is the dependency of I/O throughput 
> on the chunksize for a certain dataset.  For making the plots that I've got 
> (attached), I have chosen a dataset of 2 GB (2-dim, shape is (512, 65536) and 
> datatype is double precision) so that it can easily fit into my OS cache 
> memory (my machine has 8 GB) and make the effects clearer.  On the X axis, I
> represent the chunksize for every dataset (from 1 KB up to 8 MB).  On the Y
> axis is the performance for reading the dataset sequentially.

I'd appreciate a bit more explanation of your methodology.  You want
to test *I/O throughput* but at the same time you want to make sure
the data fits in memory cache.  Are you not then just testing memory
bandwidth?

If I were running this benchmark I would be purging the memory cache
between every run: the chunk cache is designed to improve disk
performance, right?
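
To illustrate what purging between runs could look like, here is a
minimal sketch, assuming Linux and Python 3.3 or later.  It is not taken
from the benchmark under discussion: the helper names
evict_from_page_cache and timed_read are hypothetical, and the plain
sequential read only stands in for the real HDF5 read.  The advice given
to the kernel is a hint, not a guarantee, so results still deserve a
sanity check.

```python
import os
import time


def evict_from_page_cache(path):
    """Ask the kernel to drop cached pages of `path` (advisory only).

    Returns True if the advice was issued, False if this platform
    has no os.posix_fadvise (hypothetical helper, for illustration).
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so flush them to disk first.
        os.fsync(fd)
        # offset=0, length=0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return True


def timed_read(path, block_size=1 << 20):
    """Read `path` sequentially; return (bytes_read, elapsed_seconds).

    Stand-in for the real dataset read in a benchmark.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start
```

Calling evict_from_page_cache(path) before each timed_read(path) keeps a
run from being served out of RAM left warm by the previous run.  A
system-wide alternative on Linux is to run sync and then write 3 to
/proc/sys/vm/drop_caches as root, which drops the page cache for every
file rather than just one.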

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


