[hdf-forum] Unicode filenames on Windows?

Quincey Koziol koziol at hdfgroup.org
Wed Jun 3 16:20:47 EDT 2009


Hi Andrew,

On Jun 3, 2009, at 1:10 PM, Andrew Collette wrote:

> Hi Francesc,
>
>> I don't know for sure how to do that in pure C, but if you are using
>> Python (and I think that's the case), you can encode the file name using
>> the underlying filesystem encoding.  The following function:
>>
>> import sys
>>
>> def encode_filename(filename):
>>  """Return the encoded filename in the filesystem encoding."""
>>  # Python 2: only unicode objects need encoding; byte strings pass through.
>>  if isinstance(filename, unicode):
>>    encoding = sys.getfilesystemencoding()
>>    encname = filename.encode(encoding)
>>  else:
>>    encname = filename
>>  return encname
>>
>> works well on every filesystem that I've tested (including NTFS).
>
> Yes, this is how my Unicode handling works at the moment; it seems
> fine on UNIX (UTF-8 encoding) and with common characters on Windows.
> However, trying to encode certain Unicode characters doesn't work on
> Windows; for example, u'\u1201'.  It seems that "mbcs" can only encode
> characters in the current code page.
>
> Unfortunately it looks like Windows does Unicode with a separate,
> wide-character API, so I may be out of luck.  It would be nice if HDF5
> simply took UTF-8 everywhere and called the appropriate low-level API.
> :)
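
For reference, the code-page limitation described above can be reproduced
with a short sketch. It is written for Python 3 and makes two assumptions
that are not from the thread: "cp1252" stands in for the active Windows
code page (the "mbcs" codec only exists on Windows), and can_encode is a
hypothetical helper name used purely for illustration.

```python
# Sketch only: "cp1252" is an assumed stand-in for the current Windows
# code page, and can_encode is a hypothetical helper, not an HDF5 or
# h5py function.

def can_encode(filename, encoding):
    """Return True if filename can be represented in the given encoding."""
    try:
        filename.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


common = "caf\u00e9.h5"   # e-acute is part of cp1252
ethiopic = "\u1201.h5"    # U+1201 lies in the Ethiopic block, outside cp1252

print(can_encode(common, "cp1252"))    # a "common character" case
print(can_encode(ethiopic, "cp1252"))  # the failing case from the thread
print(can_encode(ethiopic, "utf-8"))   # UTF-8 can represent any code point
```

A check like this only shows whether a name survives the encoding step; it
says nothing about what the library does with the resulting bytes.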

Hmm, I don't think we do anything special to the strings we pass to the
file system.  Is there some particular problem you are seeing?

	Quincey
