[hdf-forum] setting chunk dimensions

Ruth Aydt aydt at hdfgroup.org
Tue Oct 21 22:19:45 EDT 2008


Hi Natalie,

You can think of the hyperslabs as the way you logically access (write  
or read) subsets of a complete dataset from your application's  
perspective.  By specifying different hyperslabs you can access  
different subsets of the dataset.   You can also access the entire  
dataset -- it just depends on what you specify in the write or read.

Chunked storage defines how the dataset is physically written to /
read from disk.   The chunk size is set when the dataset is created
and remains constant.  Typically you want to choose a chunk layout that
will perform well for the most frequent logical access pattern -- or
for the access pattern that you want the best performance with.
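
To make the physical side concrete, here is a minimal sketch in plain C
(no HDF5 calls) of the arithmetic behind a chunk layout: how many chunks
of a given shape are needed to cover a dataset.  The names chunks_along
and total_chunks are hypothetical helpers for illustration, not part of
the HDF5 API, and unsigned long long stands in for hsize_t.  It assumes
every chunk has the full chunk shape, including chunks at the dataset
edge that are only partly covered by data.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks needed to cover `extent` elements along one
 * dimension when each chunk holds `chunk` elements (ceiling division). */
unsigned long long chunks_along(unsigned long long extent,
                                unsigned long long chunk)
{
    assert(chunk > 0);   /* a chunk dimension of 0 is not meaningful */
    return (extent + chunk - 1) / chunk;
}

/* Total number of chunks covering a dataset of the given rank:
 * the product of the per-dimension chunk counts. */
unsigned long long total_chunks(size_t rank,
                                const unsigned long long dims[],
                                const unsigned long long chunk_dims[])
{
    unsigned long long n = 1;
    for (size_t i = 0; i < rank; i++)
        n *= chunks_along(dims[i], chunk_dims[i]);
    return n;
}
```

With the values from the example quoted below, a 10x3 dataset with
chunk_dims {2, 5} is covered by 5 x 1 = 5 chunks, and the initial 3x3
dataset by 2 x 1 = 2 chunks.  The chunk dimensions do not have to divide
the dataset dimensions evenly, or even fit inside them.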

So hyperslabs are about logical access and chunks are about physical
storage organization on disk.   Both hyperslabs and chunks will have
the same number of dimensions as the dataset.  But, the dimension
*sizes* for both hyperslabs and chunks may be (and usually are)
different from your dataset's dimension sizes.

The interaction of chunk sizes, hyperslab selections, and various  
other factors can dramatically impact performance.
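
One rough way to reason about this interaction is to count how many
chunks a hyperslab selection overlaps, since each overlapped chunk is a
separate piece of storage that has to be located and accessed.  The
sketch below does that in plain C for the simplest case only: a single
contiguous block described by an offset and a count per dimension.  The
name chunks_touched is a hypothetical helper, not an HDF5 function, and
unsigned long long again stands in for hsize_t.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks overlapped by one contiguous block selection.
 * offset[i] is the first selected index and count[i] the number of
 * selected elements along dimension i; chunk_dims[i] is the chunk
 * size along that dimension. */
unsigned long long chunks_touched(size_t rank,
                                  const unsigned long long offset[],
                                  const unsigned long long count[],
                                  const unsigned long long chunk_dims[])
{
    unsigned long long n = 1;
    for (size_t i = 0; i < rank; i++) {
        assert(chunk_dims[i] > 0 && count[i] > 0);
        /* Index of the chunk holding the first and last selected element. */
        unsigned long long first = offset[i] / chunk_dims[i];
        unsigned long long last  = (offset[i] + count[i] - 1) / chunk_dims[i];
        n *= last - first + 1;
    }
    return n;
}
```

For example, a 7x1 selection at offset {3, 0} overlaps 4 chunks when
chunk_dims is {2, 5}, while a 7x1 selection at offset {0, 0} fits in a
single chunk when chunk_dims is {7, 1}.  These offsets are illustrative
and are not taken from the quoted example.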

You may be interested in sections 4.1 and 5 of the NetCDF-4  
Performance Report found at www.hdfgroup.org/pubs/papers.   They give  
some explanation about hyperslabs and chunked storage, and how  
performance may vary, as well as how chunked storage may impact  
filesize.

-Ruth



On Oct 21, 2008, at 3:10 AM, Natalie Happenhofer wrote:

> Hi!
> I'm trying to write my data via hyperslabs, and there is also a nice
> example of how to do it on the HDF5.org webpage. I just don't
> understand how to set the chunk_dims, or, more precisely, what do
> these chunking dimensions do?
> Here is the part of the example code using the chunk-dims:
>
> int
> main (void)
> {
>     hid_t       file;                          /* handles */
>     hid_t       dataspace, dataset;
>     hid_t       filespace;
>     hid_t       cparms;
>     hsize_t      dims[2]  = { 3, 3};            /*
>                          * dataset dimensions
>                          * at the creation time
>                          */
>     hsize_t      dims1[2] = { 3, 3};            /* data1 dimensions */
>     hsize_t      dims2[2] = { 7, 1};            /* data2 dimensions */
>
>
>     hsize_t      dims3[2] = { 2, 2};            /* data3 dimensions */
>
>     hsize_t      maxdims[2] = {H5S_UNLIMITED, H5S_UNLIMITED};
>     hsize_t      chunk_dims[2] ={2, 5};
>     hsize_t      size[2];
>     hsize_t      offset[2];
>
>     herr_t      status;
>
>     int         data1[3][3] = { {1, 1, 1},       /* data to write */
>                 {1, 1, 1},
>                 {1, 1, 1} };
>
>     int         data2[7]    = { 2, 2, 2, 2, 2, 2, 2};
>
>     int         data3[2][2] = { {3, 3},
>                 {3, 3} };
>     int fillvalue = 0;
>
>     /*
>      * Create the data space with unlimited dimensions.
>      */
>     dataspace = H5Screate_simple(RANK, dims, maxdims);
>
>     /*
>      * Create a new file. If file exists its contents will be
>      * overwritten.
>      */
>     file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
>             H5P_DEFAULT);
>
>     /*
>      * Modify dataset creation properties, i.e. enable chunking.
>      */
>     cparms = H5Pcreate(H5P_DATASET_CREATE);
>     status = H5Pset_chunk( cparms, RANK, chunk_dims);
>     status = H5Pset_fill_value (cparms, H5T_NATIVE_INT, &fillvalue );
>
>     /*
>      * Create a new dataset within the file using cparms
>      * creation properties.
>      */
>     dataset = H5Dcreate2(file, DATASETNAME, H5T_NATIVE_INT,
>             dataspace, H5P_DEFAULT, cparms, H5P_DEFAULT);
>
> chunk_dims is set to {2,5}, which I don't understand, because the
> initial dataset is 3x3 and is then extended to 10x3 - why the {2,5}?
>
> thx,
> NH

------------------------------------------------------------
Ruth Aydt
The HDF Group
1901 South First Street,  Suite C-2
Champaign, IL 61820

aydt at hdfgroup.org      (217)265-7837
------------------------------------------------------------


