NAME

Encode::UTF8Mac - "utf-8-mac" encoding, a variant of UTF-8 used by Mac OS X

SYNOPSIS

  use Encode;
  use Encode::UTF8Mac;
  
  my $filename = Encode::encode('utf-8-mac', "\x{3054}\x{FA19}\x{4F53}");
  # => \xE3\x81\x93\xE3\x82\x99\xEF\xA8\x99\xE4\xBD\x93
  # note:
  # Unicode utf-8(hex)    NFD()            Mac OS X
  # U+3054  \xE3\x81\x94  U+3053 + U+3099  decomposed
  # U+3053  \xE3\x81\x93  (no-op)
  # U+3099  \xE3\x82\x99  (no-op)
  # U+FA19  \xEF\xA8\x99  U+795E           not decomposed
  # U+4F53  \xE4\xBD\x93  (no-op)
  
  $filename = Encode::decode('utf-8-mac', $filename);
  # => \x{3054}\x{FA19}\x{4F53}

DESCRIPTION

Encode::UTF8Mac provides an encoding called "utf-8-mac", the UTF-8 variant used on Mac OS X.

Mac OS X uses UTF-8 in Normalization Form D (characters are decomposed). However, it does not follow the specification exactly.

http://developer.apple.com/library/mac/#qa/qa2001/qa1173.html

Specifically, characters in the following ranges are not decomposed:

  U+2000-U+2FFF
  U+F900-U+FAFF
  U+2F800-U+2FAFF
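
The effect of these ranges can be seen by comparing strict NFD with a simple range check. The following is a minimal sketch using only Unicode::Normalize; the helper name is_excluded is hypothetical and is not part of this module's API.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper (not part of Encode::UTF8Mac): returns 1 if the
# code point lies in one of the three ranges listed above, else 0.
sub is_excluded {
    my ($cp) = @_;
    return 1 if $cp >= 0x2000  && $cp <= 0x2FFF;
    return 1 if $cp >= 0xF900  && $cp <= 0xFAFF;
    return 1 if $cp >= 0x2F800 && $cp <= 0x2FAFF;
    return 0;
}

# Show what strict NFD does to each character, and whether the
# character falls in an excluded range.
for my $cp (0x3054, 0xFA19) {
    my $nfd = join ' ',
        map { sprintf('U+%04X', ord $_) } split //, NFD(chr $cp);
    printf "U+%04X  strict NFD: %-15s  excluded: %s\n",
        $cp, $nfd, is_excluded($cp) ? 'yes' : 'no';
}
```

Strict NFD maps U+FA19 to U+795E, but U+FA19 falls inside U+F900-U+FAFF, so it is one of the characters that stays as-is under the rule described here.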

The iconv bundled with Mac OS X provides this encoding under the name "utf-8-mac".

This module adds the "utf-8-mac" encoding to Encode; it encodes and decodes text with that rule in mind. This helps when you decode file names on a Mac.

ENCODING

utf-8-mac
  • Encode::decode('utf-8-mac', $bytes)

    Decodes the bytes as UTF-8, then normalizes the result to form C, except for the special ranges, using Unicode::Normalize.

  • Encode::encode('utf-8-mac', $unicode)

    Normalizes the text to form D, except for the special ranges, using Unicode::Normalize, then encodes it as UTF-8.
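
The two directions above can be approximated with core modules alone. This is a rough sketch of the described rule, not the module's actual implementation: sketch_encode, sketch_decode and $normalizable are hypothetical names, and the approach of normalizing only runs of characters outside the special ranges is an assumption based on the DESCRIPTION.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize qw(NFC NFD);

# Assumption: a "run" is one or more characters outside the special ranges.
my $normalizable =
    qr/[^\x{2000}-\x{2FFF}\x{F900}-\x{FAFF}\x{2F800}-\x{2FAFF}]+/;

# Hypothetical stand-in for Encode::encode('utf-8-mac', $unicode):
# decompose each run, leave special-range characters alone, then encode.
sub sketch_encode {
    my ($unicode) = @_;
    $unicode =~ s/($normalizable)/NFD($1)/ge;
    return Encode::encode('UTF-8', $unicode);
}

# Hypothetical stand-in for Encode::decode('utf-8-mac', $bytes):
# decode, then compose each run, leaving special-range characters alone.
sub sketch_decode {
    my ($bytes) = @_;
    my $unicode = Encode::decode('UTF-8', $bytes);
    $unicode =~ s/($normalizable)/NFC($1)/ge;
    return $unicode;
}

my $bytes = sketch_encode("\x{3054}\x{FA19}\x{4F53}");
my $text  = sketch_decode($bytes);
```

Skipping the special ranges matters in both directions: plain NFC would also turn U+FA19 into U+795E, so a decode that normalized the whole string would not round-trip the SYNOPSIS example.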

SEE ALSO

Encode, Encode::Locale, Unicode::Normalize

AUTHOR

Naoki Tomita <tomita@cpan.org>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.