[Hdf-forum] Chunk cache size and performance
Francesc Alted
faltet at pytables.org
Thu Jan 7 14:30:57 EST 2010
Hi,
I'm doing some small benchmarks to present at a forthcoming workshop, and I'd
be very grateful if someone could shed some light on the performance
figures that I'm getting.
What I want to stress during the workshop is the dependency of I/O throughput
on the chunksize for a certain dataset. For making the plots that I've got
(attached), I have chosen a dataset of 2 GB (2-dim, shape is (512, 65536) and
datatype is double precision) so that it can easily fit into my OS cache
memory (my machine has 8 GB) and make the effects clearer. On the X axis, I
represent the chunksize for every dataset (from 1 KB up to 8 MB). The Y
axis shows the performance for reading the dataset sequentially.
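To make the X axis concrete, here is a minimal sketch of such a chunksize sweep for the dataset shape quoted above. The power-of-two steps and the row-major rule for turning a byte size into a 2-dim chunk shape are assumptions made for illustration; the benchmark may have chosen its chunk shapes differently.

```python
ITEMSIZE = 8            # bytes per double-precision value
SHAPE = (512, 65536)    # dataset shape quoted in the text


def chunk_sizes(min_bytes=1024, max_bytes=8 * 1024**2):
    """Power-of-two chunk sizes in bytes, from min_bytes to max_bytes inclusive."""
    sizes = []
    size = min_bytes
    while size <= max_bytes:
        sizes.append(size)
        size *= 2
    return sizes


def chunk_shape(chunk_bytes, shape=SHAPE, itemsize=ITEMSIZE):
    """Assumed layout: fill along one row first, then stack whole rows."""
    nelem = chunk_bytes // itemsize
    ncols = min(nelem, shape[1])
    nrows = max(1, nelem // shape[1])
    return (nrows, ncols)


def n_chunks(cshape, shape=SHAPE):
    """Number of chunks needed to tile the dataset (ceiling division per axis)."""
    rows = -(-shape[0] // cshape[0])
    cols = -(-shape[1] // cshape[1])
    return rows * cols


if __name__ == "__main__":
    for size in chunk_sizes():
        cshape = chunk_shape(size)
        print(f"{size:>8} bytes  chunk shape {cshape}  chunks {n_chunks(cshape)}")
```

Under these assumptions the number of chunks ranges from several hundred thousand at the 1 KB end down to a few dozen at the 8 MB end, so the per-chunk overhead varies over about four orders of magnitude across the sweep.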
Now, for a chunk cache size of 1 MB (figure 'sequential-1MB.pdf'), it can
be seen that HDF5 can read at up to 1.6 GB/s (which is pretty good :-).
However, if I raise the chunk cache size to 8 MB, the peak performance falls
to a mere 1.0 GB/s, that is, almost 40% less. The Blosc compressor
performance is also strongly affected by this (slower compressors like LZO or Zlib
do not notice this effect very much because they are the obvious
bottleneck).
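As a rough illustration of how the two cache sizes relate to the chunksizes being swept, the sketch below counts how many whole chunks fit in a cache of a given size and recomputes the relative drop in peak throughput. The helper names are hypothetical. It relies on one piece of background from the HDF5 chunking documentation: a chunk larger than the chunk cache is not held in it. Whether that explains the figures here is exactly the open question.

```python
KB = 1024
MB = 1024 * KB


def chunks_in_cache(chunk_bytes, cache_bytes):
    """Whole chunks that fit in the cache; 0 means the chunk exceeds the cache."""
    return cache_bytes // chunk_bytes


def relative_drop(before, after):
    """Fractional decrease going from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    for cache in (1 * MB, 8 * MB):
        for chunk in (1 * KB, 64 * KB, 1 * MB, 8 * MB):
            fit = chunks_in_cache(chunk, cache)
            print(f"cache {cache // KB:>5} KB  chunk {chunk // KB:>5} KB  fits {fit}")
    # Peak figures quoted in the text: 1.6 GB/s at 1 MB, 1.0 GB/s at 8 MB.
    print(f"drop: {relative_drop(1.6, 1.0):.1%}")
```

With a 1 MB cache, every chunksize above 1 MB falls outside the cache, whereas with an 8 MB cache all chunksizes in the sweep can be cached.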
I've tried with other cache sizes, and the smaller, the better (reaching a
performance of almost 1.9 GB/s on my machine for a cache size of 128 KB).
Varying the number of slots in cache does not seem to affect performance too
much here.
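For anyone wanting to reproduce this kind of measurement, here is a minimal sketch using h5py, which exposes the chunk cache size and slot count as the `rdcc_nbytes` and `rdcc_nslots` arguments of `h5py.File`. This is not the code behind the attached plots: the file name, the dataset name, and the choice to read one row at a time are assumptions made for illustration.

```python
import time


def throughput_gb_per_s(nbytes, seconds):
    """Throughput in GB/s (1 GB = 1e9 bytes)."""
    return nbytes / seconds / 1e9


def time_sequential_read(path, dataset, rdcc_nbytes, rdcc_nslots=521):
    """Read a 2-dim dataset row by row and return the throughput in GB/s.

    rdcc_nbytes is the chunk cache size in bytes; rdcc_nslots is the number
    of hash slots in the cache (521 is the HDF5 default).
    """
    # Imported here so the arithmetic helper above works without h5py installed.
    import h5py

    with h5py.File(path, "r", rdcc_nbytes=rdcc_nbytes,
                   rdcc_nslots=rdcc_nslots) as f:
        dset = f[dataset]
        nbytes = 0
        start = time.perf_counter()
        for row in range(dset.shape[0]):
            nbytes += dset[row, :].nbytes  # each read returns a NumPy array
        elapsed = time.perf_counter() - start
    return throughput_gb_per_s(nbytes, elapsed)
```

A call such as `time_sequential_read("bench.h5", "data", rdcc_nbytes=128 * 1024)` (hypothetical file and dataset names) could then be repeated for each cache size and each chunksize. The first pass over a file should be discarded so that later passes are served from the OS cache, as in the setup described above.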
My guess is that the culprit behind this significant performance penalty is the
chunk cache subsystem of HDF5. Could anyone confirm this? If this is
true, do you think that some optimization in that regard could be carried out
in the future?
Thanks,
--
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sequential-8MB.pdf
Type: application/pdf
Size: 18698 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20100107/7c29e565/attachment-0002.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sequential-1MB.pdf
Type: application/pdf
Size: 18704 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20100107/7c29e565/attachment-0003.pdf>