[hdf-forum] Reading across multiple chunks is very slow
Neil Fortner
nfortne2 at hdfgroup.org
Tue Mar 31 15:05:51 EDT 2009
Francesc,
Francesc Alted wrote:
> On Friday 20 March 2009, Ger van Diepen wrote:
>
>> Hi Francesc,
>>
>> The only thing I can think of is that reading [1,1978,1556,2]
>> requires much more data shuffling. Effectively the 2 tiles have to be
>> interleaved. In the other case the 2 tiles just have to be
>> concatenated. But I doubt if that costs so much more time. Do you
>> have the amount of user and system time it took?
>>
>
> Here you have:
>
> time for two leading indices [2,1978,1556,1] --> 0.0165
> real 0m0.128s
> user 0m0.076s
> sys 0m0.052s
>
> time for two trailing indices [1,1978,1556,2] --> 2.529
> real 0m2.642s
> user 0m0.868s
> sys 0m1.764s
>
> So, yeah, it seems that the second selection takes much more user
> time and, especially, much more system time. However, reading the
> trailing indices one by one seems much faster:
>
> time for one trailing index (offset=0, count=1) --> 0.008329
> real 0m0.122s
> user 0m0.076s
> sys 0m0.048s
>
> time for one trailing index (offset=1, count=1) --> 0.00888
> real 0m0.123s
> user 0m0.092s
> sys 0m0.028s
>
> So, it definitely seems that something is suboptimal in HDF5 for this
> use case.
>
> Thanks,
>
This is happening because (in the "time for two trailing indices" case)
the data from each individual chunk is not contiguous in memory, as Ger
pointed out. Also, because the chunks are larger than the chunk cache
size (default=1 MB), the library makes a best effort to avoid
allocating enough memory to hold a whole chunk. Instead, it reads
directly from the file into the supplied read buffer. Because the
selection in the read buffer (for each chunk) is a series of small
non-contiguous blocks, the library must make a large number of small
reads.
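To put a rough number on "a large number of small reads", here is a
small sketch that counts how many separate contiguous blocks one
chunk occupies in a row-major read buffer. It assumes a chunk shape of
[1,1978,1556,1], which the thread implies but this message does not
state, and the function name contiguous_runs is made up for
illustration. It models only the memory layout, not what the library
does internally.

```c
#include <stddef.h>

/* Count the contiguous runs that one chunk's elements occupy in a
 * row-major (C-order) buffer whose shape is `sel`. `chunk` is the shape
 * of the part of the buffer filled from a single chunk, and is assumed
 * to satisfy chunk[i] <= sel[i] for every dimension. */
size_t contiguous_runs(const size_t *sel, const size_t *chunk, int rank)
{
    /* Walk in from the fastest-varying dimension while the chunk spans
     * the buffer's full extent; those dimensions merge into one run. */
    int d = rank - 1;
    while (d > 0 && chunk[d] == sel[d])
        d--;

    /* Every index combination of the slower dimensions before d starts
     * a separate run. */
    size_t runs = 1;
    for (int i = 0; i < d; i++)
        runs *= chunk[i];
    return runs;
}
```

Under that assumed chunk shape, a [2,1978,1556,1] buffer gives 1 run
per chunk, while a [1,1978,1556,2] buffer gives 1978 * 1556 = 3077768
runs per chunk, each one element long.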
To improve performance, you can increase the chunk cache size with
H5Pset_cache (or the new H5Pset_chunk_cache function if you're using the
latest snapshot). The test runs in about 0.7 seconds with this change on
my laptop, down from ~30 seconds. This is still more time than for the
contiguous case, because the library must allocate the extra space and
scatter each element individually from the cache to the read buffer, but
it now issues only one read per chunk.
I hope this is helpful,
-Neil