[hdf-forum] Reading across multiple chunks is very slow
Francesc Alted
faltet at pytables.org
Thu Mar 19 14:35:06 EDT 2009
Hi,
A PyTables user has reported a performance problem when reading a
dataset in some cases. I've tracked the problem down to the HDF5
library, as the output of the attached script reveals:
Time for creating dataset with dims {3, 1978, 1556, 288} --> 0.000000
Time for writing hyperslice {2, 1978, 1556, 2} --> 12.010000
Time for reading hyperslice {2, 1978, 1556, 1} --> 0.020000
Time for reading hyperslice {1, 1978, 1556, 2} --> 2.490000
[This dataset has a chunksize of: {1, 1978, 1556, 1}]
The question is: why does it take about 100x longer to read a hyperslice
with a count of {1, 1978, 1556, 2} than one with a count of
{2, 1978, 1556, 1}?
I have been trying to figure out what is happening, but since I can't
find a clear explanation, I suspect this may be a bug in HDF5. I've
tried HDF5 1.6.5, 1.8.2 and 1.8.2-post8, all with similar results.
Thanks,
--
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: read-performance-problem.c
Type: text/x-csrc
Size: 3913 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20090319/6bdfa764/attachment.bin>