[hdf-forum] Reading across multiple chunks is very slow
Francesc Alted
faltet at pytables.org
Fri Mar 20 06:25:48 EDT 2009
On Friday 20 March 2009, Ger van Diepen wrote:
> Hi Francesc,
>
> The only thing I can think of is that reading [1,1978,1556,2]
> requires much more data shuffling. Effectively the 2 tiles have to be
> interleaved. In the other case the 2 tiles just have to be
> concatenated. But I doubt if that costs so much more time. Do you
> have the amount of user and system time it took?
Here they are:
time for two leading indices [2,1978,1556,1] --> 0.0165
real 0m0.128s
user 0m0.076s
sys 0m0.052s
time for two trailing indices [1,1978,1556,2] --> 2.529
real 0m2.642s
user 0m0.868s
sys 0m1.764s
So, yes, the second selection takes much more user time and, especially,
much more system time. However, reading the trailing indices one at a
time is much faster:
time for one trailing index (offset=0, count=1) --> 0.008329
real 0m0.122s
user 0m0.076s
sys 0m0.048s
time for one trailing index (offset=1, count=1) --> 0.00888
real 0m0.123s
user 0m0.092s
sys 0m0.028s
So, it definitely seems that something is suboptimal in HDF5 for this
use case.
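
Given the timings above, one possible workaround is to issue the two
single-index reads separately and interleave the results in memory.
Below is a minimal sketch of just the interleaving step, in plain Python.
It makes no HDF5 calls: the two lists are stand-ins for the buffers
returned by the (offset=0, count=1) and (offset=1, count=1) reads, and
the function name interleave_trailing is hypothetical.

```python
# Sketch only: the two input lists stand in for the results of two separate
# reads along the last dimension (offset=0,count=1 and offset=1,count=1).
# No HDF5 call is made here; the names below are illustrative assumptions.

def interleave_trailing(first, second):
    """Merge two equally long flat buffers so the trailing index varies fastest."""
    if len(first) != len(second):
        raise ValueError("both reads must return the same number of elements")
    merged = [None] * (2 * len(first))
    merged[0::2] = first   # elements whose trailing index is 0
    merged[1::2] = second  # elements whose trailing index is 1
    return merged


# Tiny stand-in data: 3 elements per read instead of 1978*1556.
plane0 = [10, 11, 12]
plane1 = [20, 21, 22]
combined = interleave_trailing(plane0, plane1)
print(combined)  # [10, 20, 11, 21, 12, 22]
```

The result has the same element order as a [.., 2] selection stored with
the last index varying fastest, so it could stand in for the slow
single read if the two fast reads plus this merge turn out cheaper.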
Thanks,
--
Francesc Alted