[hdf-forum] Reading across multiple chunks is very slow

Francesc Alted faltet at pytables.org
Thu Mar 19 14:35:06 EDT 2009


Hi,

A PyTables user has reported a performance problem when reading a 
dataset in some cases.  I've tracked the problem down to the HDF5 
library, as the output of the attached script reveals:

Time for creating dataset with dims {3, 1978, 1556, 288} --> 0.000000
Time for writing hyperslice {2, 1978, 1556, 2} --> 12.010000
Time for reading hyperslice {2, 1978, 1556, 1} --> 0.020000
Time for reading hyperslice {1, 1978, 1556, 2} --> 2.490000

[This dataset has a chunksize of: {1, 1978, 1556, 1}]

The question is: why does it take more than 100 times longer to read a 
hyperslice with a count of {1, 1978, 1556, 2} than one with a count of 
{2, 1978, 1556, 1}?
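
To make the puzzle concrete, here is a minimal sketch (plain Python, no 
HDF5 calls) of the chunk arithmetic for the two reads.  The helper names 
chunks_touched and contiguous_run are hypothetical, and the sketch 
assumes that each hyperslice starts on a chunk boundary and is read into 
a compact row-major (C-order) memory buffer shaped like its count.  It 
only counts chunks and measures how many elements of one chunk stay 
adjacent in that buffer; whether this accounts for the timings above is 
a guess, not a confirmed diagnosis.

```python
def chunks_touched(start, count, chunk):
    """Number of chunks intersected by the hyperslab (start, count)."""
    n = 1
    for s, c, k in zip(start, count, chunk):
        first = s // k              # index of first chunk along this axis
        last = (s + c - 1) // k     # index of last chunk along this axis
        n *= last - first + 1
    return n


def contiguous_run(count, chunk):
    """Elements of one chunk that stay adjacent in a C-order buffer.

    Assumes the hyperslab is chunk-aligned and the memory buffer has
    exactly the shape `count`.
    """
    run = 1
    # Walk from the fastest-varying (last) axis to the slowest.
    for c, k in zip(reversed(count), reversed(chunk)):
        extent = min(c, k)          # part of this axis covered by one chunk
        run *= extent
        if extent < c:              # chunk does not span the buffer here,
            break                   # so the run cannot grow any further
    return run


chunk = (1, 1978, 1556, 1)
start = (0, 0, 0, 0)

for count in [(2, 1978, 1556, 1), (1, 1978, 1556, 2)]:
    print(count,
          "chunks:", chunks_touched(start, count, chunk),
          "contiguous run:", contiguous_run(count, chunk))
```

Under these assumptions both reads touch 2 chunks, so the number of 
chunks alone does not distinguish them.  What differs is the layout in 
memory: with count {2, 1978, 1556, 1} each chunk maps to one contiguous 
block of 1978 * 1556 = 3077768 elements, while with count 
{1, 1978, 1556, 2} the two chunks interleave element by element, giving 
a contiguous run of 1.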

I have been trying to figure out what is happening, but as I can't find 
a clear explanation, I think that perhaps this is a bug in HDF5.  I've 
tried HDF5 1.6.5, 1.8.2 and 1.8.2-post8, all with similar results.

Thanks,

-- 
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: read-performance-problem.c
Type: text/x-csrc
Size: 3913 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20090319/6bdfa764/attachment.bin>