[hdf-forum] HDF5 file 4x larger than ascii?

Richard van Hees R.M.van.Hees at sron.nl
Fri Jun 26 04:25:08 EDT 2009


Hi Werner,

Sorry, but I am not surprised, nor do I find it interesting. The
programs h5ls and h5dump simply do not show the large amount of
duplicated metadata that you are storing in the HDF5 file. You wrote:
"I did not expect it (= HDF5) to be *that* inefficient"; HDF5 is not
inefficient, it simply writes to the file what you have asked it to write.

To improve your layout:
* Ask yourself: "Do I need the definition of the compound "point"?" If
so, then write the definition of "point" to the root of the HDF5 file, or
even store the whole group Chart in the root, only once.
* You likely do not need all the metadata overhead if you switch to
the Packet Table API (H5PT) and write date and point into a table.
Alternatively, you could use the older Table API (H5TB). There are nice
examples available on the HDF5 website.
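
To make the table suggestion a bit more concrete, here is a rough sketch of
the idea in plain Python. It does not use HDF5 at all; it only illustrates
what "one fixed-length record per time step" looks like as raw bytes, which
is the kind of layout a packet table keeps in a single dataset. The record
layout (one timestamp plus three x/y/z points, all doubles) and the names
RECORD, pack_table and unpack_table are assumptions made for illustration;
the actual "point" compound in path1.f5 may well differ.

```python
import struct

# Hypothetical record: 1 timestamp + 3 points * 3 coordinates,
# all little-endian doubles -> 10 doubles = 80 bytes per time step.
RECORD = struct.Struct("<d9d")


def pack_table(steps):
    """Pack (time, [p0, p1, p2]) entries into one contiguous byte string."""
    out = bytearray()
    for t, points in steps:
        flat = [coord for point in points for coord in point]
        out += RECORD.pack(t, *flat)
    return bytes(out)


def unpack_table(buf):
    """Inverse of pack_table: recover (time, [p0, p1, p2]) entries."""
    rows = []
    for values in RECORD.iter_unpack(buf):
        t, flat = values[0], values[1:]
        rows.append((t, [flat[i:i + 3] for i in range(0, 9, 3)]))
    return rows


# Made-up data: 1000 time steps with three points each.
steps = []
for i in range(1000):
    v = float(i)
    steps.append((v, [(v, 0.0, 0.0), (0.0, v, 0.0), (0.0, 0.0, v)]))

table = pack_table(steps)
print(RECORD.size, len(table))  # prints "80 80000"
```

The point of the sketch is only that the payload grows by a fixed number of
bytes per time step, with no per-step group or datatype description. Any
file size well beyond that is overhead from the chosen layout, not from the
data itself.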

Good luck.

Richard


Werner Benger wrote:
> Now this is interesting: I got an HDF5 file which is 4x larger than
> its corresponding representation as "h5ls -rvd" or "h5dump". 
>
> The HDF5 file is 6MB, and available here:
>
> http://sciviz.cct.lsu.edu/data/h5path/path1.f5
>
> Its output by "h5ls -rvd" is 1.4MB:
>
> http://sciviz.cct.lsu.edu/data/h5path/path1.h5ls
>
> And "h5dump" on same file brings it to 1.5MB:
>
> http://sciviz.cct.lsu.edu/data/h5path/path1.h5dump
>
>
> I'm aware that this kind of data layout is inefficient for the
> data stored here; it consists of a time series of just three points
> at each time step, each of them stored in some subgroups.
>
> However, I did not expect it to be *that* inefficient such that the
> ascii dump is 4x smaller than the corresponding binary HDF5 file 
> (using HDF5 1.8.2-post13).
>
> It's not really a performance issue here, since the data file is
> still small, and the layout is intended for really large data where
> this metadata overhead will become negligible. Still I'm wondering if
> there would be a "sufficiently easy" way to reduce the file size
> significantly? Maybe there is some "pack all metadata together" property
> setting or similar?
>
> Cheers,
> 	Werner
>
>   

