[hdf-forum] Reading across multiple chunks is very slow

Francesc Alted faltet at pytables.org
Fri Mar 20 06:25:48 EDT 2009


On Friday 20 March 2009, Ger van Diepen wrote:
> Hi Francesc,
>
> The only thing I can think of is that reading [1,1978,1556,2]
> requires much more data shuffling. Effectively the 2 tiles have to be
> interleaved. In the other case the 2 tiles just have to be
> concatenated. But I doubt if that costs so much more time. Do you
> have the amount of user and system time it took?

Here they are:

time for two leading indices [2,1978,1556,1] --> 0.0165
real    0m0.128s
user    0m0.076s
sys     0m0.052s

time for two trailing indices [1,1978,1556,2] --> 2.529
real    0m2.642s
user    0m0.868s
sys     0m1.764s

So, yes, the second selection takes much more user time and, especially, 
much more system time.  However, reading the trailing indices one by one 
is much faster:

time for one trailing index (offset=0, count=1) --> 0.008329
real    0m0.122s
user    0m0.076s
sys     0m0.048s

time for one trailing index (offset=1, count=1) --> 0.00888
real    0m0.123s
user    0m0.092s
sys     0m0.028s

So, it definitely seems that something is suboptimal in HDF5 for this 
use case.
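
To make the "interleaved" versus "concatenated" distinction concrete, here 
is a minimal sketch in pure Python. It assumes that each tile (chunk) has 
shape [1,1978,1556,1], which is only inferred from the "2 tiles" remark 
above and not confirmed here, and that the output buffer is in C 
(row-major) order. The names flat_offsets and count_runs are hypothetical 
helpers, and the two middle dimensions are scaled down to 4 x 3 so the 
script runs instantly. It counts how many separate contiguous pieces one 
tile occupies in the destination buffer.

```python
from itertools import product


def flat_offsets(out_shape, start, count):
    """Sorted C-order offsets, inside a buffer of shape `out_shape`, of the
    block that begins at `start` and spans `count` elements along each axis."""
    # Row-major strides: the last axis varies fastest.
    strides = []
    step = 1
    for n in reversed(out_shape):
        strides.insert(0, step)
        step *= n
    ranges = [range(s, s + c) for s, c in zip(start, count)]
    return sorted(
        sum(i * st for i, st in zip(idx, strides)) for idx in product(*ranges)
    )


def count_runs(offsets):
    """Number of maximal runs of consecutive offsets in a sorted list."""
    if not offsets:
        return 0
    runs = 1
    for a, b in zip(offsets, offsets[1:]):
        if b != a + 1:
            runs += 1
    return runs


ROWS, COLS = 4, 3                  # scaled-down stand-ins for 1978 and 1556
tile = (1, ROWS, COLS, 1)          # assumed tile shape (see the text above)

# Two leading indices: the first tile fills the first half of the buffer.
leading_runs = count_runs(flat_offsets((2, ROWS, COLS, 1), (0, 0, 0, 0), tile))

# Two trailing indices: the first tile fills every other element.
trailing_runs = count_runs(flat_offsets((1, ROWS, COLS, 2), (0, 0, 0, 0), tile))

print(leading_runs)    # 1
print(trailing_runs)   # 12, i.e. ROWS * COLS
```

Under these assumptions, with the real dimensions each tile would land in 
a single piece in the leading case and in 1978 * 1556 = 3077768 
one-element pieces in the trailing case. This only shows the shape of the 
copy pattern; it does not by itself explain the large system time.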

Thanks,

-- 
Francesc Alted

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe at hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe at hdfgroup.org.