Audio::TagLib::String - A wide string class suitable for unicode


  use Audio::TagLib::String;
  my $i = Audio::TagLib::String->new("blah blah blah");
  print $i->toCString(), "\n"; # got "blah blah blah"


This is an implicitly shared wide string. It uses Audio::TagLib::wstring for storage, but that is an implementation detail and could change. Strings are stored internally as UTF-16BE, without a BOM (byte order mark).

The use of implicit sharing means that copying a string is cheap; the only cost is incurred when the copy is modified. Until then, the copy just holds a pointer to the data of the parent String. This also makes the class suitable as a function return type.
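The sharing behaviour described above can be sketched as follows. This is a minimal illustration that uses only the constructors and methods documented on this page; the variable names are arbitrary, and it assumes the append method is spelled append() as in the underlying TagLib class.

```perl
use strict;
use warnings;
use Audio::TagLib::String;

my $parent = Audio::TagLib::String->new("blah blah blah");

# Shallow copy: according to the description above, no character
# data is duplicated at this point.
my $child = Audio::TagLib::String->new($parent);

# Modifying the copy is the moment it has to get its own data.
$child->append(Audio::TagLib::String->new(" and more"));

# The parent is expected to be unaffected by the change to the copy.
print $parent->toCString(), "\n";
print $child->toCString(), "\n";
```

Passing a String by value or returning one from a function relies on the same mechanism, so neither should copy character data unless the receiver modifies it.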

In addition to adding implicit sharing, this class keeps track of four possible encodings, which are the four supported by the ID3v2 standard.
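To make the encoding names concrete, the sketch below encodes one short string with Perl's core Encode module and prints the resulting bytes. It does not use Audio::TagLib at all. The mapping from the names on this page to Encode names is an assumption made for illustration, and the byte order Encode picks for plain UTF-16 is Encode's own choice, not necessarily what this class emits.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "cafe" with U+00E9; the character fits in Latin1, so every
# encoding below can represent the whole string.
my $text = "caf\x{e9}";

# Names used on this page => names understood by Encode
# (this mapping is an assumption for the example).
my %encode_name = (
    Latin1  => 'ISO-8859-1',
    UTF8    => 'UTF-8',
    UTF16   => 'UTF-16',      # Encode prepends a BOM here
    UTF16BE => 'UTF-16BE',    # no BOM
    UTF16LE => 'UTF-16LE',    # no BOM
);

my %encoded;
for my $type (sort keys %encode_name) {
    $encoded{$type} = encode($encode_name{$type}, $text);
    printf "%-8s %2d bytes  %s\n",
        $type, length($encoded{$type}), unpack('H*', $encoded{$type});
}
```

The 8-bit encodings need one byte (Latin1) or one to two bytes (UTF8) per character in this example, while the UTF-16 variants need two bytes per character plus an optional two-byte BOM.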

  • %_Type

    The four types of string encodings supported by the ID3v2 specification. ID3v1 is assumed to be Latin1 and Ogg Vorbis comments use UTF8.

    qw(Latin1 UTF16 UTF16BE UTF16LE UTF8)

    keys %Audio::TagLib::String::_Type also lists all available values.

    NOTE Call binmode STDOUT, ":utf8" before printing a UTF8 string.

  • new()

    Constructs an empty String.

  • new(String $s)

    Makes a shallow, implicitly shared copy of $s. Because the copy is implicitly shared, this method is lightweight and suitable for pass-by-value usage.

  • new(ByteVector $v, PV $t = "Latin1")

    Makes a deep copy of the data in $v.

    NOTE This should only be used with the 8-bit codecs Latin1 and UTF8. When used with other codecs, it will simply print a warning and exit.

  • new(PV $data, PV $encode)

    Constructs a String from the data $data encoded by $encode.

  • new(PV $data)

    Constructs a String from the data $data.

    NOTE $data should be in Perl's internal format. The UTF8 flag is checked to determine which encoding to use (Latin1 or UTF8 in this case).


  • DESTROY()

    Destroys this String instance.

  • PV to8Bit(BOOL $unicode = FALSE)

    If $unicode is false (the default), this returns a Latin1-encoded string. If it is true, the returned string is UTF-8 encoded and has the UTF8 flag set.

  • PV toCString(BOOL $unicode = FALSE)

    see to8Bit()

    WARNING Unlike in C/C++, the PV contains a copy of the string returned by the C/C++ code.

  • Iterator begin()

    Returns an iterator pointing to the beginning of the string.

  • Iterator end()

    Returns an iterator pointing to the end of the string (the position after the last character).

  • IV find(String $s, IV $offset = 0)

    Finds the first occurrence of pattern $s in this string, starting from $offset. If the pattern is not found, -1 is returned.

  • String substr(UV $position, UV $n = 0xffffffff)

    Extract a substring from this string starting at $position and continuing for $n characters.

  • String append(String $s)

    Append $s to the current string and return a reference to the current string.

  • String upper()

    Returns an upper case version of the string.

    WARNING This only works for the characters in US-ASCII, i.e. A-Z.

  • UV size()

    Returns the size of the string.

  • BOOL isEmpty()

    Returns true if the string is empty.

    see isNull()

  • BOOL isNull()

    Returns true if this string is null -- i.e. it is a copy of the String::null string.

    NOTE A string can be empty and not null.

    see isEmpty()

  • ByteVector data(PV $type)

    Returns a ByteVector containing the string's data. If $type is Latin1 or UTF8, this will return a vector of 8 bit characters, otherwise it will use 16 bit characters.

  • IV toInt()

    Convert the string to an integer.

  • String stripWhiteSpace()

    Returns a string with the leading and trailing whitespace stripped.

  • String number(IV $n) [static]

    Converts the base-10 integer $n to a string.

  • PV getChar(IV $i)

    Returns the character at position $i. If the character is a wide character, it is encoded as UTF8 and the UTF8 flag is set.

  • String copy(String $s)

    Performs a shallow, implicitly shared, copy of $s, overwriting the String's current data.

  • String copy(ByteVector $v)

    Performs a deep copy of the data in $v.

  • String copy(PV $data)

    Copies $data into the current String. The UTF8 flag is checked to determine which encoding to use (Latin1 or UTF8).

  • String null() [static]

    Returns a static null string provided for convenience.
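As a short tour of the methods listed above, the following sketch trims a string, searches it and converts part of it to an integer. It relies only on the signatures given on this page; the sample text and variable names are made up for the example, and it assumes the upper-casing method is spelled upper() as in the underlying TagLib class.

```perl
use strict;
use warnings;
use Audio::TagLib::String;

my $raw     = Audio::TagLib::String->new("  Track 07  ");
my $trimmed = $raw->stripWhiteSpace();     # leading/trailing blanks removed

print "text:  ", $trimmed->toCString(), "\n";
print "size:  ", $trimmed->size(), "\n";
print "upper: ", $trimmed->upper()->toCString(), "\n";

# find() returns -1 when the pattern is absent, so check before using it.
my $pos = $trimmed->find(Audio::TagLib::String->new("07"));
if ($pos >= 0) {
    my $digits = $trimmed->substr($pos, 2);
    print "track: ", $digits->toInt(), "\n";
}

# A freshly constructed String holds no characters.
print "new() is empty\n" if Audio::TagLib::String->new()->isEmpty();
```

Because find() signals a miss with -1 rather than an exception, the result is tested before it is handed to substr().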


None by default.


== != += < >




Dongxu Ma, <>


Copyright (C) 2005 by Dongxu Ma

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.
