[hdf-forum] Re: Question about meta data for chunked datasets

Mark Howison MHowison at lbl.gov
Mon Mar 23 19:07:11 EDT 2009


Hi Rob,

Here is a plot of 4 procs showing three tests (source code attached):

1) truncate to 2GB
2) truncate to 2250776576
3) each proc writes a byte, then truncate to 2250776576

The truncate (purple) seems to take about the same time in all three
tests, and in each it is on the same time scale as the opens and
closes (green and brown).

In my other case, the truncate is taking orders of magnitude longer
than the open/close, which must be a peculiarity of that IO pattern.
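
The three tests listed above can be approximated with a short
standalone C helper. This is a minimal single-process sketch, not the
attached main.c (which ran on 4 procs). The helper names
timed_truncate and file_size, and whatever file path is passed in, are
assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t so sizes above 2 GB fit */

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Wall-clock time in seconds, read from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/*
 * Create `path`, optionally write one byte at offset 0, then ftruncate
 * the file to `size` bytes. Returns the seconds spent inside ftruncate,
 * or -1.0 on any error. The file is left in place for inspection.
 */
static double timed_truncate(const char *path, off_t size, int write_first)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1.0;

    if (write_first && pwrite(fd, "x", 1, 0) != 1) {
        close(fd);
        return -1.0;
    }

    double start = now_seconds();
    int rc = ftruncate(fd, size);
    double elapsed = now_seconds() - start;

    if (close(fd) != 0 || rc != 0)
        return -1.0;
    return elapsed;
}

/* Size of `path` in bytes as reported by stat, or -1 on error. */
static off_t file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}
```

Calling timed_truncate with sizes 2147483648 (taking 2GB to mean 2^31
bytes), 2250776576 and 2250776576, and write_first set to 0, 0 and 1
respectively, mirrors tests 1 to 3. It exercises only one client, so it
may not show effects that depend on several processes touching the file.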

Mark

On Mon, Mar 23, 2009 at 12:46 PM, Rob Latham <robl at mcs.anl.gov> wrote:
> On Mon, Mar 23, 2009 at 12:05:20PM -0700, Mark Howison wrote:
>> However, as you can see in the attached plot, the truncate (purple) at
>> the end is still taking up a substantial amount of the total IO time
>> for this test application. For now, I will probably disable the
>> truncate directly in the MPI-POSIX VFD code, like Noel Keen has done,
>> but in the long term we should figure out why it is there and when it
>> is necessary. Hopefully, the Lustre/HDF5 funding will come through
>> soon!
>
> Maybe HDF5 needs to truncate, maybe it doesn't.  But if I'm reading
> your plot right, only one process is calling truncate.   Sounds to me
> like you've found a Lustre issue.
>
> What does Lustre do if you run a standalone program that calls
> ftruncate to create a 2GB file?   To create a 2250776576 byte file? If
> you do a few writes before calling ftruncate?
>
> ==rob
>
>> Thanks
>> Mark
>>
>> ifi=5 -1 open64("../output/prs.h5part",2,-1) 3.47368e+01 2.34790e-02
>> ifi=5 41 open64("../output/prs.h5part",578,-1) 3.47642e+01 1.86651e-02
>> ifi=5 0 lseek64(41,0,0) 3.47932e+01 2.14577e-06
>> ifi=5 96 write(41,0x7fffffffb6e0,96) 3.47932e+01 1.29604e-03
>> ifi=5 1048576 lseek64(41,1048576,0) 3.48958e+01 3.09944e-06
>> ifi=5 1757600 write(41,0x37137d40,1757600) 3.48958e+01 1.58372e+00
>> ifi=5 1757600 write(41,0x372e4ee0,1757600) 3.64796e+01 4.77600e-01
>> ifi=5 1757600 write(41,0x37492080,1757600) 3.69572e+01 1.23870e-02
>> ifi=5 1757600 write(41,0x3763f220,1757600) 3.69696e+01 3.26340e-02
>> ifi=5 96 lseek64(41,96,0) 4.13958e+01 1.90735e-06
>> ifi=5 40 write(41,0x5d01aaa8,40) 4.13959e+01 4.13990e-03
>> ifi=5 544 write(41,0x5d01a548,544) 4.14000e+01 1.71661e-05
>> ifi=5 120 write(41,0x5d01b118,120) 4.14000e+01 1.50204e-05
>> ifi=5 40 write(41,0x5d01bc78,40) 4.14001e+01 1.50204e-05
>> ifi=5 544 write(41,0x5d01a548,544) 4.14001e+01 1.47820e-05
>> ifi=5 120 write(41,0x5d01c1c8,120) 4.14001e+01 1.38283e-05
>> ifi=5 328 write(41,0x7fffffffb630,328) 4.14001e+01 1.50204e-05
>> ifi=5 40 write(41,0x5d024248,40) 4.14001e+01 1.50204e-05
>> ifi=5 544 write(41,0x5d01a548,544) 4.14002e+01 1.50204e-05
>> ifi=5 120 write(41,0x5d024798,120) 4.14002e+01 1.50204e-05
>> ifi=5 328 write(41,0x7fffffffb630,328) 4.14002e+01 1.50204e-05
>> ifi=5 40 write(41,0x5d026ae8,40) 4.14002e+01 1.40667e-05
>> ifi=5 544 write(41,0x5d01a548,544) 4.14002e+01 1.50204e-05
>> ifi=5 120 write(41,0x5d0270d8,120) 4.14003e+01 1.50204e-05
>> ifi=5 328 write(41,0x7fffffffb630,328) 4.14003e+01 1.40667e-05
>> ifi=5 272 write(41,0x5d02b0d8,272) 4.14003e+01 1.54972e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14005e+01 1.62125e-05
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14005e+01 1.54018e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14007e+01 1.49965e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14009e+01 1.52111e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14010e+01 1.69277e-05
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14011e+01 1.52111e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14012e+01 1.53065e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14014e+01 1.53065e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14016e+01 1.59740e-05
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14016e+01 1.51157e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14018e+01 1.52111e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14020e+01 1.52111e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14021e+01 1.59740e-05
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14022e+01 1.49965e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14023e+01 1.50919e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14025e+01 1.53065e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14027e+01 1.50919e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14028e+01 1.59740e-05
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14029e+01 1.49965e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14030e+01 1.48058e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14032e+01 1.49965e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14034e+01 1.69277e-05
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14034e+01 1.51873e-04
>> ifi=5 3136 write(41,0x5d02a118,3136) 4.14036e+01 1.52826e-04
>> ifi=5 328 write(41,0x7fffffffb630,328) 4.14037e+01 1.59740e-05
>> ifi=5 0 lseek64(41,0,0) 4.14039e+01 9.53674e-07
>> ifi=5 96 write(41,0x7fffffffb4e0,96) 4.14039e+01 1.69277e-05
>> ifi=5 0 ftruncate64(41,2250776576) 4.14039e+01 1.06910e+00
>> ifi=5 0 lseek64(41,0,0) 4.24736e+01 3.09944e-06
>> ifi=5 96 write(41,0x7fffffffb4a0,96) 4.24737e+01 4.69685e-05
>> ifi=5 0 close(41) 4.24739e+01 3.16906e-03
>>
>>
>> On Tue, Feb 17, 2009 at 9:19 AM, Quincey Koziol <koziol at hdfgroup.org> wrote:
>> > Hi Mark,
>> >
>> > On Feb 13, 2009, at 3:41 PM, Mark Howison wrote:
>> >
>> >> Also, here is a graph showing that same activity on node 0 (the first
>> >> row of pixels). The color key is:
>> >>
>> >> blue = write
>> >> dark purple = truncate
>> >> purple = fsync
>> >> teal = fflush
>> >>
>> >> Mark
>> >>
>> >>
>> >> On Fri, Feb 13, 2009 at 12:03 PM, Mark Howison <MHowison at lbl.gov> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> I have a parallel HDF5 application that is writing out chunked data to
>> >>> a 3D dataset and is exhibiting a large number of small writes upon
>> >>> closing the file. Below I've attached a trace of POSIX calls on node 0
>> >>> showing the file open, then 4 chunks of size 1757600 bytes being
>> >>> written, then a series of 40- to 3136-byte writes (mostly 3136), and
>> >>> then a truncate call before the file is closed. The small writes are
>> >>> not ideal because this is a Lustre file system on a Cray XT at NERSC.
>> >>> Together, those small writes and truncate take about 30% of the time
>> >>> from file open to close.
>> >>>
>> >>> My hypothesis is that the small writes represent meta data related to
>> >>> the chunk indexing. Does that sound right?
>> >
>> >        Yes, that's probably correct.
>> >
>> >>> What is the best way for me to consolidate these small writes into one
>> >>> large write? Should I use
>> >>> H5Pset_meta_block_size() to set the block size to the Lustre stripe
>> >>> width of 1MB?
>> >
>> >        Yes, that would probably help.
>> >
>> >>> I'm a little concerned by the fact that the 3136 byte
>> >>> writes are not to contiguous offsets, and perhaps cannot be
>> >>> consolidated into a single write.
>> >>>
>> >>> What is the purpose of the truncate? Can it be removed?
>> >
>> >        I think with some analysis we could eliminate the truncate in
>> > some/all cases, but we'll need to finish getting funding in place to work on
>> > these issues with Lustre.
>> >
>> >        Quincey
>> >
>> >>> Thanks,
>> >>>
>> >>> Mark Howison
>> >>> mhowison at lbl.gov
>> >>> Student Research Assistant
>> >>> Visualization Group, Lawrence Berkeley National Labs
>> >>>
>> >>>
>> >>> ifi=5 41 open64("../output/prs.h5part",66,-1) 3.26833e+01 6.02412e-03
>> >>> ifi=5 0 close(41) 3.26894e+01 1.08004e-04
>> >>> ifi=5 41 open64("../output/prs.h5part",2,-1) 3.26895e+01 3.85680e-02
>> >>> ifi=5 0 lseek64(41,0,2) 3.27325e+01 1.83105e-03
>> >>> ifi=5 0 lseek64(41,0,0) 3.27344e+01 9.53674e-07
>> >>> ifi=5 96 write(41,0x7fffffffb740,96) 3.27358e+01 3.38793e-04
>> >>> ifi=5 7304 lseek64(41,7304,0) 3.27391e+01 3.09944e-06
>> >>> ifi=5 1757600 write(41,0x371cdc80,1757600) 3.27391e+01 9.63148e-01
>> >>> ifi=5 1757600 write(41,0x3737ae20,1757600) 3.37023e+01 2.74949e-02
>> >>> ifi=5 1757600 write(41,0x37527fc0,1757600) 3.37299e+01 1.32360e-02
>> >>> ifi=5 1757600 write(41,0x376d5160,1757600) 3.37432e+01 1.96590e-02
>> >>> ifi=5 96 lseek64(41,96,0) 3.45493e+01 1.90735e-06
>> >>> ifi=5 40 write(41,0x5d0b8188,40) 3.45493e+01 1.57619e-03
>> >>> ifi=5 544 write(41,0x5d0b7ca8,544) 3.45509e+01 1.69277e-05
>> >>> ifi=5 120 write(41,0x5d0b87f8,120) 3.45510e+01 1.50204e-05
>> >>> ifi=5 40 write(41,0x5d0b9308,40) 3.45510e+01 1.38283e-05
>> >>> ifi=5 544 write(41,0x5d0b7ca8,544) 3.45510e+01 1.50204e-05
>> >>> ifi=5 120 write(41,0x5d0b9858,120) 3.45510e+01 1.38283e-05
>> >>> ifi=5 328 write(41,0x7fffffffb660,328) 3.45510e+01 1.59740e-05
>> >>> ifi=5 40 write(41,0x5d0c18d8,40) 3.45511e+01 1.40667e-05
>> >>> ifi=5 544 write(41,0x5d0b7ca8,544) 3.45511e+01 1.50204e-05
>> >>> ifi=5 120 write(41,0x5d0c1ed8,120) 3.45511e+01 1.40667e-05
>> >>> ifi=5 328 write(41,0x7fffffffb660,328) 3.45511e+01 1.50204e-05
>> >>> ifi=5 40 write(41,0x5d0c4288,40) 3.45511e+01 1.40667e-05
>> >>> ifi=5 544 write(41,0x5d0b7ca8,544) 3.45512e+01 1.50204e-05
>> >>> ifi=5 120 write(41,0x5d0c4918,120) 3.45512e+01 1.40667e-05
>> >>> ifi=5 328 write(41,0x7fffffffb660,328) 3.45512e+01 1.40667e-05
>> >>> ifi=5 272 write(41,0x5d0c8948,272) 3.45512e+01 3.58105e-03
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.45548e+01 1.69277e-05
>> >>> ifi=5 114251304 lseek64(41,114251304,0) 3.45549e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.45549e+01 1.46720e-02
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.45696e+01 1.78814e-05
>> >>> ifi=5 214440776 lseek64(41,214440776,0) 3.45696e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.45696e+01 1.86720e-02
>> >>> ifi=5 314627112 lseek64(41,314627112,0) 3.45883e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.45883e+01 1.42689e-02
>> >>> ifi=5 414813448 lseek64(41,414813448,0) 3.46026e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.46026e+01 1.24190e-02
>> >>> ifi=5 514999784 lseek64(41,514999784,0) 3.46150e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.46150e+01 1.48160e-02
>> >>> ifi=5 615186120 lseek64(41,615186120,0) 3.46299e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.46299e+01 3.93460e-02
>> >>> ifi=5 715372456 lseek64(41,715372456,0) 3.46693e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.46693e+01 1.76220e-02
>> >>> ifi=5 815558792 lseek64(41,815558792,0) 3.46869e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.46869e+01 1.06070e-02
>> >>> ifi=5 915745128 lseek64(41,915745128,0) 3.46975e+01 1.19209e-06
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.46975e+01 1.74150e-02
>> >>> ifi=5 1015931464 lseek64(41,1015931464,0) 3.47150e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47150e+01 1.11501e-02
>> >>> ifi=5 1116117800 lseek64(41,1116117800,0) 3.47262e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47262e+01 1.67122e-02
>> >>> ifi=5 1216304136 lseek64(41,1216304136,0) 3.47429e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47429e+01 5.77402e-03
>> >>> ifi=5 1316490472 lseek64(41,1316490472,0) 3.47487e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47487e+01 1.83940e-02
>> >>> ifi=5 1416676808 lseek64(41,1416676808,0) 3.47671e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47671e+01 1.35159e-02
>> >>> ifi=5 1516863144 lseek64(41,1516863144,0) 3.47806e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47806e+01 1.70491e-02
>> >>> ifi=5 1617049480 lseek64(41,1617049480,0) 3.47977e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.47977e+01 9.32908e-03
>> >>> ifi=5 1717235816 lseek64(41,1717235816,0) 3.48071e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.48071e+01 1.15631e-02
>> >>> ifi=5 1817422152 lseek64(41,1817422152,0) 3.48187e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.48187e+01 8.60000e-03
>> >>> ifi=5 1917608488 lseek64(41,1917608488,0) 3.48273e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.48273e+01 6.62398e-03
>> >>> ifi=5 2017794824 lseek64(41,2017794824,0) 3.48339e+01 1.19209e-06
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.48339e+01 7.51495e-03
>> >>> ifi=5 2117981160 lseek64(41,2117981160,0) 3.48415e+01 0.00000e+00
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.48415e+01 1.77360e-02
>> >>> ifi=5 2218167496 lseek64(41,2218167496,0) 3.48592e+01 9.53674e-07
>> >>> ifi=5 3136 write(41,0x5d0c7948,3136) 3.48592e+01 1.63181e-02
>> >>> ifi=5 2249807432 lseek64(41,2249807432,0) 3.48756e+01 0.00000e+00
>> >>> ifi=5 328 write(41,0x7fffffffb660,328) 3.48756e+01 7.10177e-03
>> >>> ifi=5 0 lseek64(41,0,0) 3.48828e+01 0.00000e+00
>> >>> ifi=5 96 write(41,0x7fffffffb510,96) 3.48828e+01 2.69413e-05
>> >>> ifi=5 0 ftruncate64(41,2249809480) 3.48829e+01 7.08644e-01
>> >>> ifi=5 0 fsync(41) 3.55917e+01 3.28633e-01
>> >>> ifi=5 0 lseek64(41,0,0) 3.59472e+01 1.90735e-06
>> >>> ifi=5 96 write(41,0x7fffffffb4d0,96) 3.59473e+01 5.88894e-05
>> >>> ifi=5 0 close(41) 3.59477e+01 9.05991e-06
>> >>>
>> >>
>> >> <node0-meta-data.png>
>> >> ----------------------------------------------------------------------
>> >> This mailing list is for HDF software users discussion.
>> >> To subscribe to this list, send a message to
>> >> hdf-forum-subscribe at hdfgroup.org.
>> >> To unsubscribe, send a message to hdf-forum-unsubscribe at hdfgroup.org.
>> >
>> >
>
>
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tag.png
Type: image/png
Size: 5710 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20090323/026b1b61/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: main.c
Type: text/x-csrc
Size: 2319 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20090323/026b1b61/attachment.bin>

