[hdf-forum] hdf5 import benchmark
Francesc Alted
faltet at pytables.org
Tue Jun 30 07:41:51 EDT 2009
Hi Mag,
On Tuesday 30 June 2009 01:38:42 Mag Gam wrote:
> Does anyone have HDF5 data import benchmarks? I want to know what is a
> reasonable load time for a particular dataset. For example, if you
> have a 5-million-row tab-separated file and you import the data into HDF5,
> how long would/should it take? What factors affect the load time?
> Assume there is no I/O bottleneck.
>
> I want to see how fast my approach is: I am able to load 1000 rows in
> 7 seconds. How does that compare?
That's pretty slow. With the attached Python script, I'm able to import a
5-million-row CSV file into an HDF5 file in around 16 s. That is roughly
300,000 rows per second, versus about 140 rows per second for your 1000 rows
in 7 s. The script is written for PyTables, but can be migrated to h5py if
desired.
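
For readers who cannot get the attachment, here is a minimal sketch of the
general approach. It is not the original import-csv.py. It assumes a
tab-separated file with three columns (an integer and two floats); the column
names, the batch size, and the function names read_batches and import_to_hdf5
are all made up for illustration. The idea it shows is to parse the text in
large batches and append each batch to the table in one call, since writing
row by row is a common cause of slow loads.

```python
import csv
from itertools import islice


def read_batches(path, batch_size=100_000, delimiter="\t"):
    """Yield lists of (int, float, float) tuples, batch_size rows at a time.

    The three-column layout is an assumption made for this sketch.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        while True:
            batch = [(int(a), float(b), float(c))
                     for a, b, c in islice(reader, batch_size)]
            if not batch:
                return
            yield batch


def import_to_hdf5(src_path, h5_path, batch_size=100_000):
    """Append the parsed batches to a PyTables table; return the row count."""
    # Imported here so read_batches can be used without PyTables installed.
    import tables

    # Hypothetical column names; pos fixes the column order in the table.
    description = {
        "id": tables.Int64Col(pos=0),
        "x": tables.Float64Col(pos=1),
        "y": tables.Float64Col(pos=2),
    }
    with tables.open_file(h5_path, mode="w") as h5:
        # expectedrows is only a sizing hint for PyTables.
        table = h5.create_table("/", "data", description,
                                expectedrows=5_000_000)
        for batch in read_batches(src_path, batch_size):
            table.append(batch)  # one append per batch, not per row
        table.flush()
        return table.nrows
```

Calling import_to_hdf5("data.tsv", "data.h5") would write the table to
/data in data.h5 (both file names are placeholders). Timings will depend on
the column types, the batch size, and the machine.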
HTH,
--
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: import-csv.py
Type: text/x-python
Size: 1222 bytes
Desc: not available
URL: <http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/attachments/20090630/207a2bcd/attachment.py>