[hdf-forum] Reading across multiple chunks is very slow

Elena Pourmal epourmal at hdfgroup.org
Sat Mar 21 19:23:52 EDT 2009


Hi Francesc,

Noted. We will take a look. Thanks for reporting!

Elena

On Mar 19, 2009, at 1:35 PM, Francesc Alted wrote:

> Hi,
>
> A PyTables user has reported a performance problem when reading a
> dataset in some cases.  I've tracked the problem down to the HDF5
> library, as the output of the attached script reveals:
>
> Time for creating dataset with dims {3, 1978, 1556, 288} --> 0.000000
> Time for writing hyperslice {2, 1978, 1556, 2} --> 12.010000
> Time for reading hyperslice {2, 1978, 1556, 1} --> 0.020000
> Time for reading hyperslice {1, 1978, 1556, 2} --> 2.490000
>
> [This dataset has a chunksize of: {1, 1978, 1556, 1}]
>
> The question is: why did it take about 100 times longer to read a
> hyperslice with a count of {1, 1978, 1556, 2} than one with a count of
> {2, 1978, 1556, 1}?
>
> I tried to figure out what is happening, but since I can't find a
> clear explanation, I think this may be a bug in HDF5.  I've tried
> HDF5 1.6.5, 1.8.2 and 1.8.2-post8, all with similar results.
>
> Thanks,
>
> -- 
> Francesc Alted
> <read-performance-problem.c>
> ----------------------------------------------------------------------
> This mailing list is for HDF software users discussion.
> To subscribe to this list, send a message to hdf-forum-subscribe at hdfgroup.org.
> To unsubscribe, send a message to hdf-forum-unsubscribe at hdfgroup.org.
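
A note on the numbers in the quoted message: the short C sketch below works through the arithmetic for the two reads. It assumes the hyperslices start at the origin, use unit stride and block, and are read into a row-major memory buffer with the same shape as the count; none of this is stated in the thread, and the helper names chunks_touched and contiguous_run are hypothetical, not part of the HDF5 API. It does not establish the cause of the slowdown; it only shows one way in which the two reads differ.

```c
#include <assert.h>

#define RANK 4

typedef unsigned long long u64;

/* Count the chunks intersected by a hyperslab (unit stride and block assumed). */
u64 chunks_touched(const u64 start[RANK], const u64 count[RANK],
                   const u64 chunk[RANK])
{
    u64 n = 1;
    for (int d = 0; d < RANK; d++) {
        assert(chunk[d] > 0 && count[d] > 0);
        u64 first = start[d] / chunk[d];
        u64 last = (start[d] + count[d] - 1) / chunk[d];
        n *= last - first + 1;
    }
    return n;
}

/*
 * For a selection starting at the origin, return the longest run of
 * elements that is contiguous both inside one chunk and inside a
 * row-major memory buffer whose shape equals count[].
 */
u64 contiguous_run(const u64 count[RANK], const u64 chunk[RANK])
{
    u64 run = 1;
    for (int d = RANK - 1; d >= 0; d--) {
        assert(chunk[d] > 0 && count[d] > 0);
        /* Extent of the selection inside a single chunk along dimension d. */
        u64 sel = count[d] < chunk[d] ? count[d] : chunk[d];
        run *= sel;
        /* The run ends once the chunk or the buffer is only partly covered. */
        if (sel != count[d] || sel != chunk[d])
            break;
    }
    return run;
}
```

With the chunk shape {1, 1978, 1556, 1}, both reads touch 2 chunks and the same number of elements. For the count {2, 1978, 1556, 1}, each chunk maps to a single contiguous run of 3077768 elements in memory. For the count {1, 1978, 1556, 2}, the run length is 1, so under these assumptions each chunk would be scattered into 3077768 separate, interleaved positions in the buffer.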

