[Hdf-forum] workarounds for dynamically allocated multidimensional arrays?
Quincey Koziol
koziol at hdfgroup.org
Wed Jan 20 11:37:14 EST 2010
Hi Darren,
On Jan 20, 2010, at 10:18 AM, Darren Adams wrote:
> Quincey,
> OK, this sounds encouraging; I thought what I was doing made sense. When I ran the code with the INDEPENDENT transfer property, the file actually contained some data despite the errors I posted, and not gobbledygook pointer addresses either, as far as I could tell. Switching to COLLECTIVE mode results in a segfault with no stderr output.
>
> I do believe the memory is contiguously allocated; at least, that is the expressed intent of the developers. The calloc calls themselves are a bit convoluted, and I haven't completely deciphered them to verify this claim. I pasted in the subroutine that allocates the 3D array below, in case you are interested.
Unfortunately, you are not making a single call to malloc/calloc, so you are going to have some serious complications if you want to create a selection to these elements. It _is_ possible, but you'd have to define your array as starting at offset 0 in memory, probably as a 1-D dataset, and then use multiple H5Sselect_hyperslab() calls to select the elements you want to perform I/O on.
> Right now I can't log in to Kraken to see which MPI I'm using. I think it is a Cray-packaged one (MPT) based on MPICH, so I'm not sure how feasible testing a new MPICH there would be, but I can certainly try it out on another system if need be. What version do I need?
Assuming that you've hit this bug (which I'm not certain of, now that you've shown that you aren't allocating a single buffer in memory), it would have to be a snapshot from the MPICH developers (i.e. it isn't released yet).
> The only workaround I've been able to get working so far is to do a hyperslab select and write for every 1D column, which amounts to N^2 calls on an N^3 array. I have verified that this works with the H5T_COMPOUND datatype I am writing, but performance is quite slow.
>
> Thanks for the info.
>
> -Darren
>
> P.S. Can you post a link to the MPICH bug?
Sure, here's the link to the MPICH bug tracker:
http://trac.mcs.anl.gov/projects/mpich2/ticket/972
Quincey
>
>
> ---------------- 3d calloc code snippet ----------------------------------
> void*** calloc_3d_array(size_t nt, size_t nr, size_t nc, size_t size)
> {
>   void ***array;
>   size_t i,j;
>
>   if((array = (void ***)calloc(nt,sizeof(void**))) == NULL){
>     ath_error("[calloc_3d] failed to allocate memory for %d 1st-pointers\n",
>               (int)nt);
>     return NULL;
>   }
>
>   if((array[0] = (void **)calloc(nt*nr,sizeof(void*))) == NULL){
>     ath_error("[calloc_3d] failed to allocate memory for %d 2nd-pointers\n",
>               (int)(nt*nr));
>     free((void *)array);
>     return NULL;
>   }
>
>   for(i=1; i<nt; i++){
>     array[i] = (void **)((unsigned char *)array[0] + i*nr*sizeof(void*));
>   }
>
>   if((array[0][0] = (void *)calloc(nt*nr*nc,size)) == NULL){
>     ath_error("[calloc_3d] failed to alloc. memory (%d X %d X %d of size %d)\n",
>               (int)nt,(int)nr,(int)nc,(int)size);
>     free((void *)array[0]);
>     free((void *)array);
>     return NULL;
>   }
>
>   for(j=1; j<nr; j++){
>     array[0][j] = (void **)((unsigned char *)array[0][j-1] + nc*size);
>   }
>
>   for(i=1; i<nt; i++){
>     array[i][0] = (void **)((unsigned char *)array[i-1][0] + nr*nc*size);
>     for(j=1; j<nr; j++){
>       array[i][j] = (void **)((unsigned char *)array[i][j-1] + nc*size);
>     }
>   }
>
>   return array;
> }
>
> -----------------------------------------------------------------------------
>
> ----- Original Message -----
> From: "Quincey Koziol" <koziol at hdfgroup.org>
> To: hdf-forum at hdfgroup.org
> Sent: Wednesday, January 20, 2010 7:01:50 AM GMT -06:00 US/Canada Central
> Subject: Re: [Hdf-forum] workarounds for dynamically allocated multidimensional arrays?
>
> Hi Darren,
>
> On Jan 19, 2010, at 3:16 PM, Darren Adams wrote:
>
>> Hi,
>> I'm working with an MPI CFD code that uses dynamically allocated 3D arrays of structs to store data. I'm trying to develop a scalable parallel I/O routine using HDF5, or I suppose PHDF5. I've had some initial success using a 1D buffering approach, but I'm not crazy about this method. It leads to an H5Sselect_hyperslab and H5Dwrite call for every 1D "pencil" in the 3D space, and performance is far from optimal. My next attempt was to use HDF derived datatypes and try to write from the main data array directly. This seems much simpler, as I only need to select a hyperslab from memory that excludes the "ghost" cells, then one from the file dataspace that offsets to the correct MPI process's section, and then make a single H5Dwrite call!
>
> Yes, that's the preferred direction to go.
>
>> Cool, except that I kept getting low level I/O errors (below) which, after much googling, finally landed me on this forum post:
>>
>> http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2008-June/001118.html
>>
>> The gist of which is:
>>
>>> ... The HDF5 library expects to
>>> receive a contiguous array of elements, not pointers to elements in
>>> lower dimensions.
>>
>> The above forum post includes an alternate approach, which is basically a hyperslab select and write for every 1D column, which is exactly what I wanted to avoid.
>>
>> Are there any other workarounds so that I can define just a single hyperslab and make a single H5Dwrite call on each process?
>
> Is your buffer allocated in one contiguous block in memory? (I know you are sub-selecting pieces of that, but what I'm asking is whether it was allocated with a single "malloc" call, really)
>
> Quincey
>
> P.S. - You may be running into an MPICH error that we discovered and the MPICH team has addressed for their next release. Are you able to test with a new version of MPICH, also?
>
>> -Darren
>>
>> My error:
>>
>> HDF5-DIAG: Error detected in HDF5 (1.8.3) MPI-process 5:
>>   #000: H5Dio.c line 266 in H5Dwrite(): can't write data
>>     major: Dataset
>>     minor: Write failed
>>   #001: H5Dio.c line 578 in H5D_write(): can't write data
>>     major: Dataset
>>     minor: Write failed
>>   #002: H5Dcontig.c line 539 in H5D_contig_write(): contiguous write failed
>>     major: Dataset
>>     minor: Write failed
>>   #003: H5Dselect.c line 306 in H5D_select_write(): write error
>>     major: Dataspace
>>     minor: Write failed
>>   #004: H5Dselect.c line 217 in H5D_select_io(): write error
>>     major: Dataspace
>>     minor: Write failed
>>   #005: H5Dcontig.c line 1127 in H5D_contig_writevv(): block write failed
>>     major: Low-level I/O
>>     minor: Write failed
>>   #006: H5Fio.c line 159 in H5F_block_write(): file write failed
>>     major: Low-level I/O
>>     minor: Write failed
>>   #007: H5FDint.c line 185 in H5FD_write(): driver write request failed
>>     major: Virtual File Layer
>>     minor: Write failed
>>   #008: H5FDmpio.c line 1820 in H5FD_mpio_write(): MPI_File_write_at failed
>>     major: Internal error (too specific to document in detail)
>>     minor: Some MPI function failed
>>   #009: H5FDmpio.c line 1820 in H5FD_mpio_write(): Other I/O error , error stack:
>> ADIOI_GEN_WRITECONTIG(189): Other I/O error Bad address
>>     major: Internal error (too specific to document in detail)
>>     minor: MPI Error String
>>
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> Hdf-forum at hdfgroup.org
>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org