[hdf-forum] Parallel hdf5 problem in 1.8.0 through 1.8.2 (fixed)

Robert Latham robl at mcs.anl.gov
Fri Mar 6 13:59:30 EST 2009


On Fri, Mar 06, 2009 at 06:17:37PM +0000, Ricardo Fonseca wrote:
> P.S -> (rob) Thanks for the input, this work is actually in preparation 
> for a BlueGene system. Could you tell me if that flag is set on your 
> configuration? If you just look into $H5DIR/include/H5pubconf.h around 
> line 431 you can check the definition of the  
> H5_MPI_COMPLEX_DERIVED_DATATYPE_WORKS macro.

This flag is indeed unset in my HDF5 config. 
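
For anyone who would rather script that check than open the header by
hand, here is a minimal sketch.  It assumes the usual autoconf
convention, where an enabled option appears as an active "#define" line
and a disabled one as a commented-out "#undef"; I have not verified that
every HDF5 build writes H5pubconf.h exactly that way.  The helper names
(macro_is_defined, check_header) are made up for illustration and are
not part of HDF5.

```python
import re
from pathlib import Path

MACRO = "H5_MPI_COMPLEX_DERIVED_DATATYPE_WORKS"


def macro_is_defined(header_text, macro=MACRO):
    """Return True if header_text contains an active #define of macro.

    Assumption: a disabled option is written as a commented-out
    '/* #undef NAME */' line, which does not start with '#define'
    and is therefore not counted.
    """
    pattern = re.compile(r"^\s*#\s*define\s+" + re.escape(macro) + r"\b")
    return any(pattern.match(line) for line in header_text.splitlines())


def check_header(path, macro=MACRO):
    """Read the header at path and report 'set' or 'unset' for macro."""
    text = Path(path).read_text()
    return "set" if macro_is_defined(text, macro) else "unset"
```

Calling check_header() on the path to $H5DIR/include/H5pubconf.h (with
H5DIR expanded) should return "unset" for a configuration like mine.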

I am quite interested in knowing more about just what datatype stuff
HDF5 wants to do that MPICH2 cannot handle.   If, as you observe, HDF5
tries to free an MPI_BYTE then maybe there's work to be done on both
ends.  

Also, since I'm poking around in nearby code, the check for whether
collective I/O works seems... weird to me.  By the MPI standard, all
processes in a communicator must call a collective routine if any of
them calls it.  If some of those processes have zero bytes of I/O,
that should be just fine.  If it isn't, you've found another bug in
the MPI-IO implementation.

Note how I'm not rushing off to test these settings myself.  I fully
understand time and resource constraints.  Just making a note of
things to look at "one day" if we need more parallel I/O performance.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
