[hdf-forum] Unicode filenames on Windows?
Quincey Koziol
koziol at hdfgroup.org
Wed Jun 3 16:20:47 EDT 2009
Hi Andrew,
On Jun 3, 2009, at 1:10 PM, Andrew Collette wrote:
> Hi Francesc,
>
>> I don't know for sure how to do that in pure C, but if you are using
>> Python (and I think that's the case), you can encode the file name
>> using the underlying filesystem encoding. The following function:
>>
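>> import sys  # needed for sys.getfilesystemencoding()
>>
>> def encode_filename(filename):
>>     """Return the encoded filename in the filesystem encoding."""
>>     # Python 2: only unicode objects need encoding; byte strings pass through.
>>     if isinstance(filename, unicode):
>>         encoding = sys.getfilesystemencoding()
>>         encname = filename.encode(encoding)
>>     else:
>>         encname = filename
>>     return encname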
>>
>> works well on every filesystem that I've tested (including NTFS).
>
> Yes, this is how my Unicode handling works at the moment; it seems
> fine on UNIX (UTF-8 encoding) and with common characters on Windows.
> However, trying to encode certain Unicode characters doesn't work on
> Windows; for example, u'\u1201'. It seems that "mbcs" can only encode
> characters in the current code page.
>
> Unfortunately it looks like Windows does Unicode with a separate,
> wide-character API, so I may be out of luck. It would be nice if HDF5
> simply took UTF-8 everywhere and called the appropriate low-level API.
> :)
Hmm, I don't think we do anything special to the strings we pass to
the file system. Is there some particular problem you are seeing?
Quincey
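
For reference, here is a minimal sketch of the limitation Andrew describes.
It assumes Python 3 and uses "cp1252" as a stand-in for a Windows code page,
since the "mbcs" codec itself exists only on Windows. The helper name
can_encode and the ".h5" file name are hypothetical, chosen for illustration;
neither comes from HDF5 or from the code quoted above.

```python
def can_encode(name, encoding):
    """Return True if `name` can be represented in `encoding`.

    Hypothetical helper for illustration; not part of HDF5 or any binding.
    """
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# The character from the thread, plus an illustrative extension.
filename = "\u1201.h5"

# Assumption: cp1252 stands in for the current Windows code page ("mbcs").
print(can_encode(filename, "cp1252"))     # a single code page cannot hold U+1201
print(can_encode(filename, "utf-8"))      # UTF-8 covers all of Unicode
print(can_encode(filename, "utf-16-le"))  # layout used by Windows wide-character strings
```

If the first check fails, a name encoded with the filesystem encoding cannot
reach the narrow-character file API intact, which is the case reported above
for u'\u1201'.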