[hdf-forum] h5py -- most efficient way to load a hdf5

Andrew Collette andrew.collette at gmail.com
Thu Jun 25 00:21:12 EDT 2009


Hi,

> ds[offset:offset+100] = arr
>
> that will load the entire set into memory which could be costly...

No, it will assign the contents of "arr" to a 100-element slice of the
dataset.  This is one of the features of HDF5; you don't have to load
the whole thing into memory to modify it.  As for turning a line of
CSV into a dataset, you need to have NumPy turn each line of text into
an array element, so it can be stored in the dataset.  It won't happen
automatically.  One function I bumped into that does that is this one:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromregex.html
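
For illustration, here is a rough sketch of how numpy.fromregex might be
used on comma-separated text.  The sample data, the field names "id" and
"value", and the regular expression are all made up for this example; the
only real requirement is that the dtype be a structured one, with one
field per regex group.

```python
import io

import numpy as np

# Hypothetical two-column integer CSV, made up for illustration.
text = io.StringIO("1,10\n2,20\n3,30\n")

# fromregex expects a structured dtype: one field per regex group.
dtype = [("id", np.int64), ("value", np.int64)]

# Each match of the pattern becomes one record in the result.
records = np.fromregex(text, r"(\d+),(\d+)", dtype)

# Individual columns can be pulled out by field name.
ids = records["id"]
values = records["value"]
```

The result is a 1-D record array, so it would need a dataset with a
matching compound type, or the columns would have to be stored separately.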

Otherwise, you can use line.split(",") and convert each field manually,
as in your previous example.  It will be slow, but it will work.
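
Putting the two points together, something along these lines should work.
It is only a sketch: the helper names (parse_line, csv_to_dataset), the
chunk_rows parameter, the file and dataset names, and the sample data are
all hypothetical, and it assumes every column is numeric.  The file is
opened with the in-memory "core" driver purely so the example leaves
nothing on disk; a normal file would be used in practice.

```python
import io

import h5py
import numpy as np


def parse_line(line):
    """Convert one line of comma-separated text into a list of floats."""
    return [float(field) for field in line.split(",")]


def csv_to_dataset(lines, ds, chunk_rows=100):
    """Parse lines and write them into ds, chunk_rows rows at a time."""
    offset = 0
    buf = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        buf.append(parse_line(line))
        if len(buf) == chunk_rows:
            arr = np.array(buf, dtype=ds.dtype)
            # Only this slice of the dataset is written; the rest of the
            # dataset is never loaded into memory.
            ds[offset:offset + len(arr)] = arr
            offset += len(arr)
            buf = []
    if buf:
        # Write whatever is left over after the last full chunk.
        arr = np.array(buf, dtype=ds.dtype)
        ds[offset:offset + len(arr)] = arr
        offset += len(arr)
    return offset


# Hypothetical input: 250 rows of 3 numeric columns.
text = "".join(f"{i},{i * 2},{i * 0.5}\n" for i in range(250))

# driver="core" with backing_store=False keeps the file in memory only.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("data", shape=(250, 3), dtype="f8")
    rows_written = csv_to_dataset(io.StringIO(text), ds)
    last_row = ds[249]  # reads back a single row, not the whole dataset
```

Only one chunk of parsed rows is held in memory at a time, so chunk_rows
trades memory use against the number of writes to the dataset.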

Andrew

PS: We can continue this discussion in private email if you want.  I'm
not sure that Python-side CSV translation is a burning priority for
the HDF folks. :)
