[hdf-forum] Unicode filenames on Windows?

Andrew Collette andrew.collette at gmail.com
Wed Jun 3 14:10:52 EDT 2009


Hi Francesc,

> I don't know for sure how to do that in pure C, but if you are using Python
> (and I think that's the case), you can encode the file name using the
> underlying filesystem encoding.  The next function:
>
> import sys
>
> def encode_filename(filename):
>     """Return the encoded filename in the filesystem encoding."""
>     # Only unicode objects need encoding; byte strings pass through as-is.
>     if isinstance(filename, unicode):
>         return filename.encode(sys.getfilesystemencoding())
>     return filename
>
> works well on every filesystem that I've tested (including NTFS).

Yes, this is how my Unicode handling works at the moment; it seems
fine on UNIX (UTF-8 encoding) and with common characters on Windows.
However, certain Unicode characters fail to encode on Windows; for
example, u'\u1201'.  It seems that the "mbcs" codec can only encode
characters in the current code page.
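
To make the limitation concrete, here is a rough sketch.  It is not
taken from my actual code: the helper name can_encode is made up for
illustration, the snippet is written in Python 3 syntax, and it uses
"cp1252" as a stand-in for the Windows ANSI code page, since the
"mbcs" codec itself only exists on Windows and may substitute
characters rather than raise.

```python
# Sketch only: "cp1252" stands in for the Windows ANSI code page ("mbcs").
def can_encode(name, encoding):
    """Return True if every character of name is representable in encoding."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# A Latin-range character that cp1252 contains.
print(can_encode("caf\u00e9.h5", "cp1252"))
# The character from the example above, which cp1252 does not contain.
print(can_encode("\u1201.h5", "cp1252"))
# UTF-8 has no such restriction for valid characters.
print(can_encode("\u1201.h5", "utf-8"))
```

A check like this at least lets the caller report a clear error
instead of handing a mangled name to the library.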

Unfortunately it looks like Windows does Unicode with a separate,
wide-character API, so I may be out of luck.  It would be nice if HDF5
simply took UTF-8 everywhere and called the appropriate low-level API.
:)
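
What I have in mind is roughly the conversion below.  This is only a
sketch of the idea, not anything HDF5 does today: utf8_to_wide is a
hypothetical helper, and the assumption is that a library would apply
something like it before calling a wide-character Windows function
such as _wfopen or CreateFileW, which expect NUL-terminated UTF-16
strings.

```python
def utf8_to_wide(name_utf8):
    """Convert a UTF-8 byte string to a NUL-terminated UTF-16-LE buffer.

    Hypothetical helper: this is the form of filename that the Windows
    wide-character APIs expect.
    """
    text = name_utf8.decode("utf-8")            # bytes -> Unicode text
    return text.encode("utf-16-le") + b"\x00\x00"  # append a 16-bit NUL

wide = utf8_to_wide("\u1201.h5".encode("utf-8"))
print(len(wide))  # 4 characters plus terminator, 2 bytes each
```

With this approach the caller always passes UTF-8, and the current
code page never enters the picture.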

Andrew
