use strict; use warnings;

package HTML::Tiny;

use Carp;

=head1 NAME

HTML::Tiny - Lightweight, dependency free HTML/XML generation

=cut

our $VERSION = '1.08';

BEGIN {

  # https://developer.mozilla.org/en-US/docs/Web/HTML/Element
  for my $tag ( qw(
    a abbr acronym address applet area article aside audio
    b base bdi bdo big blink blockquote body br button
    canvas caption center cite code col colgroup
    data datalist dd del details dfn dialog dir div dl dt
    em embed
    fieldset figcaption figure font footer form frame frameset
    h1 h2 h3 h4 h5 h6 head header hgroup hr html
    i iframe img input ins
    kbd keygen
    label legend li link
    main map mark marquee menu menuitem meta meter
    nav nobr noframes noscript
    object ol optgroup option output
    p param picture portal pre progress
    q
    rb rp rt rtc ruby
    s samp script section select slot small source spacer span strike strong style sub summary sup
    table tbody td template textarea tfoot th thead time title tr track tt
    u ul
    var video
    wbr
    xmp
  ) ) {
    no strict 'refs';
    *$tag = sub { shift->auto_tag( $tag, @_ ) };
  }
}

# Tags that are closed (<br /> versus <br></br>)
my @DEFAULT_CLOSED
  # https://developer.mozilla.org/en-US/docs/Glossary/Empty_element
  = qw( area base br col embed hr img input keygen link meta param source track wbr );

# Tags that get a trailing newline
my @DEFAULT_NEWLINE = qw( html head body div p tr table );

my %DEFAULT_AUTO = (
  suffix => '',
  method => 'tag'
);

=head1 SYNOPSIS

  use HTML::Tiny;

  my $h = HTML::Tiny->new;

  # Generate a simple page
  print $h->html(
    [
      $h->head( $h->title( 'Sample page' ) ),
      $h->body(
        [
          $h->h1( { class => 'main' }, 'Sample page' ),
          $h->p( 'Hello, World', { class => 'detail' }, 'Second para' )
        ]
      )
    ]
  );

  # Outputs (reformatted here for readability)
  <html>
    <head>
      <title>Sample page</title>
    </head>
    <body>
      <h1 class="main">Sample page</h1>
      <p>Hello, World</p>
      <p class="detail">Second para</p>
    </body>
  </html>

=head1 DESCRIPTION

C<< HTML::Tiny >> is a simple, dependency free module for generating
HTML (and XML). It concentrates on generating syntactically correct
XHTML using a simple Perl notation.

In addition to the HTML generation methods, utility methods are
provided to:

=over

=item * encode and decode URL encoded strings

=item * entity encode HTML

=item * build query strings

=item * JSON encode data structures

=back

=head1 INTERFACE

=over

=item C<< new >>

Create a new C<< HTML::Tiny >> object. The constructor takes one optional
named argument, C<< mode >>, which can be either C<< 'xml' >> (the
default) or C<< 'html' >>. In HTML mode closed tags are emitted without
the trailing forward slash, so C<< br >> produces C<< <br> >> rather
than C<< <br /> >>. Any other value causes C<< new >> to C<croak>.

Example:

  # Set HTML mode.
  my $h = HTML::Tiny->new( mode => 'html' );

  # The default is XML mode, but this can also be defined explicitly.
  $h = HTML::Tiny->new( mode => 'xml' );

HTML is a dialect of SGML, and is not XML in any way. "Orphan" open tags
or unclosed tags are legal and in fact expected by user agents. In
practice, if you want to generate XML or XHTML, supply no arguments. If
you want valid HTML, use C<< mode => 'html' >>.
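
As a rough illustration of the difference between the two modes (a
sketch based on the behaviour described above, not an exhaustive
specification; the C<'sgml'> mode name is just an arbitrary invalid
value):

  use HTML::Tiny;

  my $xml  = HTML::Tiny->new;                      # XML mode (default)
  my $html = HTML::Tiny->new( mode => 'html' );

  print $xml->br;     # <br />
  print $html->br;    # <br>

  # Any mode other than 'xml' or 'html' makes the constructor croak
  my $ok = eval { HTML::Tiny->new( mode => 'sgml' ); 1 };
  print "rejected\n" unless $ok;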

=back

=cut

sub new {
  my $self = bless {}, shift;

  my %params = @_;
  my $mode = $params{'mode'} || 'xml';

  croak "Unknown mode: $mode"
   unless $mode eq 'xml'
     or $mode eq 'html';

  $self->{'_mode'} = $mode;

  $self->_set_auto( 'method', 'closed', @DEFAULT_CLOSED );
  $self->_set_auto( 'suffix', "\n",     @DEFAULT_NEWLINE );
  return $self;
}

sub _set_auto {
  my ( $self, $kind, $value ) = splice @_, 0, 3;
  $self->{autotag}->{$kind}->{$_} = $value for @_;
}

=head2 HTML Generation

=over

=item C<< tag( $name, ... ) >>

Returns HTML (or XML) that encloses each of the arguments in the
specified tag. For example:

  print $h->tag('p', 'Hello', 'World');

would print

  <p>Hello</p><p>World</p>

Notice that each argument is individually wrapped in the specified tag.
To avoid this, multiple arguments can be grouped in an anonymous array:

  print $h->tag('p', ['Hello', 'World']);

would print

  <p>HelloWorld</p>

The [ and ] can be thought of as grouping a number of arguments.

Attributes may be supplied by including an anonymous hash in the
argument list:

  print $h->tag('p', { class => 'normal' }, 'Foo');

would print

  <p class="normal">Foo</p>

Attribute values will be HTML entity encoded as necessary.
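
A small sketch of that encoding. As far as the implementation shows,
only attribute values are encoded automatically; content arguments are
inserted verbatim, so text that may contain markup characters is
probably best passed through C<< entity_encode >> first:

  use HTML::Tiny;

  my $h = HTML::Tiny->new;

  # The attribute value is entity encoded for you
  my $link = $h->tag( 'a', { title => 'Tom & "Jerry"' }, 'Cartoon' );
  print $link;    # <a title="Tom &amp; &#34;Jerry&#34;">Cartoon</a>

  # Content is not, so encode it explicitly
  my $em = $h->tag( 'em', $h->entity_encode( '1 < 2' ) );
  print $em;      # <em>1 &lt; 2</em>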

Multiple hashes may be supplied in which case they will be merged:

  print $h->tag('p',
    { class => 'normal' }, 'Bar',
    { style => 'color: red' }, 'Bang!'
  );

would print

  <p class="normal">Bar</p><p class="normal" style="color: red">Bang!</p>

Notice that the class="normal" attribute is merged with the style
attribute for the second paragraph.

To remove an attribute, set its value to undef:

  print $h->tag('p',
    { class => 'normal' }, 'Bar',
    { class => undef }, 'Bang!'
  );

would print

  <p class="normal">Bar</p><p>Bang!</p>

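An empty (boolean) attribute, such as C<checked> on a checkbox, can be
encoded by passing an empty array reference:

  print $h->closed( 'input', { type => 'checkbox', checked => [] } );

would print

  <input checked="checked" type="checkbox" />

in the default XML mode, where the attribute name is repeated as its
value, and

  <input checked type="checkbox">

in HTML mode.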

B<Return Value>

In scalar context C<< tag >> returns a string. In list context it
returns a list, each element of which corresponds to one of the
original content arguments:

  my @html = $h->tag('p', 'this', 'that');

would return

  @html = (
    '<p>this</p>',
    '<p>that</p>'
  );

That means that when you nest calls to tag (or the equivalent HTML
aliases - see below) the individual arguments to the inner call will be
tagged separately by each enclosing call. In practice this means that

  print $h->tag('p', $h->tag('b', 'Foo', 'Bar'));

would print

  <p><b>Foo</b></p><p><b>Bar</b></p>

You can modify this behaviour by grouping multiple arguments in an
anonymous array:

  print $h->tag('p', [ $h->tag('b', 'Foo', 'Bar') ] );

would print

  <p><b>Foo</b><b>Bar</b></p>

This behaviour is powerful but can take a little time to master. If you
imagine '[' and ']' preventing the propagation of the 'tag individual
items' behaviour it might help visualise how it works.

Here's an HTML table (using the tag-name convenience methods - see
below) that demonstrates it in more detail:

  print $h->table(
    [
      $h->tr(
        [ $h->th( 'Name', 'Score', 'Position' ) ],
        [ $h->td( 'Therese',  90, 1 ) ],
        [ $h->td( 'Chrissie', 85, 2 ) ],
        [ $h->td( 'Andy',     50, 3 ) ]
      )
    ]
  );

which would print the unformatted version of:

    <table>
        <tr><th>Name</th><th>Score</th><th>Position</th></tr>
        <tr><td>Therese</td><td>90</td><td>1</td></tr>
        <tr><td>Chrissie</td><td>85</td><td>2</td></tr>
        <tr><td>Andy</td><td>50</td><td>3</td></tr>
    </table>

Note how you don't need a td() for every cell or a tr() for every row.
Notice also how the square brackets around the rows prevent tr() from
wrapping each individual cell.

Often when generating nested HTML you will find yourself writing
corresponding nested calls to HTML generation methods. The table
generation code above is an example of this.

If you prefer, these nested method calls can be deferred like this:

  print $h->table(
    [
      \'tr',
      [ \'th', 'Name',     'Score', 'Position' ],
      [ \'td', 'Therese',  90,      1 ],
      [ \'td', 'Chrissie', 85,      2 ],
      [ \'td', 'Andy',     50,      3 ]
    ]
  );

In general a nested call like

  $h->method( args )

may be rewritten like this

  [ \'method', args ]

This allows complex HTML to be expressed as a pure data structure. See
the C<stringify> method for more information.

=cut

sub tag {
  my ( $self, $name ) = splice @_, 0, 2;

  my %attr = ();
  my @out  = ();

  for my $a ( @_ ) {
    if ( 'HASH' eq ref $a ) {

      # Merge into attributes
      %attr = ( %attr, %$a );
    }
    else {

      # Generate markup
      push @out,
         $self->_tag( 0, $name, \%attr )
       . $self->stringify( $a )
       . $self->close( $name );
    }
  }

  # Special case: generate an empty tag pair if there's no content
  push @out, $self->_tag( 0, $name, \%attr ) . $self->close( $name )
   unless @out;

  return wantarray ? @out : join '', @out;
}

=item C<< open( $name, ... ) >>

Generate an opening HTML or XML tag. For example:

  print $h->open('marker');

would print

  <marker>

Attributes can be provided in the form of anonymous hashes in the same
way as for C<< tag >>. For example:

  print $h->open('marker', { lat => '57.0', lon => -2 });

would print

  <marker lat="57.0" lon="-2">

The latitude is quoted because Perl would otherwise stringify the
numeric literal C<57.0> as C<57>.

As for C<< tag >>, multiple attribute hash references will be merged.
The example above could be written:

  print $h->open('marker', { lat => '57.0' }, { lon => -2 });

=cut

sub open { shift->_tag( 0, @_ ) }

=item C<< close( $name ) >>

Generate a closing HTML or XML tag. For example:

  print $h->close('marker');

would print:

  </marker>

=cut

sub close { "</$_[1]>" }

=item C<< closed( $name, ... ) >>

Generate a closed HTML or XML tag. For example:

  print $h->closed('marker');

would print:

  <marker />

As for C<< tag >> and C<< open >>, attributes may be provided as hash
references. The latitude is quoted here because Perl would otherwise
stringify the numeric literal C<57.0> as C<57>:

  print $h->closed('marker', { lat => '57.0' }, { lon => -2 });

would print:

  <marker lat="57.0" lon="-2" />

=cut

sub closed { shift->_tag( 1, @_ ) }

=item C<< auto_tag( $name, ... ) >>

Calls either C<< tag >> or C<< closed >> based on built-in rules for the
tag, and appends a newline after block-level tags such as C<< p >> and
C<< div >>. Used internally to implement the tag-named methods.
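
A brief sketch of how the built-in rules play out in the default XML
mode, assuming the default closed-tag and newline lists have not been
altered:

  use HTML::Tiny;

  my $h = HTML::Tiny->new;

  my $br = $h->auto_tag( 'br' );             # closed tag:  <br />
  my $em = $h->auto_tag( 'em', 'Hello' );    # wrapped:     <em>Hello</em>
  my $p  = $h->auto_tag( 'p', 'Hello' );     # wrapped, with a trailing newline

  print $br, $em, $p;

Calling C<< $h->p('Hello') >> is equivalent to the last line, since the
tag-named methods simply forward to C<< auto_tag >>.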

=cut

sub auto_tag {
  my ( $self, $name ) = splice @_, 0, 2;
  my ( $method, $post )
   = map { $self->{autotag}->{$_}->{$name} || $DEFAULT_AUTO{$_} }
   ( 'method', 'suffix' );
  my @out = map { $_ . $post } $self->$method( $name, @_ );
  return wantarray ? @out : join '', @out;
}

=item C<< stringify( $obj ) >>

Called internally to obtain string representations of values.

It also implements the deferred method call notation (mentioned
above) so that

  my $table = $h->table(
    [
      $h->tr(
        [ $h->th( 'Name', 'Score', 'Position' ) ],
        [ $h->td( 'Therese',  90, 1 ) ],
        [ $h->td( 'Chrissie', 85, 2 ) ],
        [ $h->td( 'Andy',     50, 3 ) ]
      )
    ]
  );

may also be written like this:

  my $table = $h->stringify(
    [
      \'table',
      [
        \'tr',
        [ \'th', 'Name',     'Score', 'Position' ],
        [ \'td', 'Therese',  90,      1 ],
        [ \'td', 'Chrissie', 85,      2 ],
        [ \'td', 'Andy',     50,      3 ]
      ]
    ]
  );

Any reference to an array whose first element is a reference to a scalar

  [ \'methodname', args ]

is executed as a call to the named method with the specified args.
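
Reading the implementation, C<< stringify >> also accepts objects: if a
value provides an C<as_string> method its return value is used,
otherwise Perl's default stringification applies. A minimal sketch, in
which C<My::Greeting> is a hypothetical class invented for this example:

  package My::Greeting;    # hypothetical class, for illustration only

  sub new       { my ( $class, $text ) = @_; return bless { text => $text }, $class }
  sub as_string { my $self = shift; return $self->{text} }

  package main;

  use HTML::Tiny;

  my $h    = HTML::Tiny->new;
  my $para = $h->p( My::Greeting->new( 'Hello' ) );

  print $para;    # <p>Hello</p> followed by a newline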

=cut

sub stringify {
  my ( $self, $obj ) = @_;
  if ( ref $obj ) {

    # Flatten array refs...
    if ( 'ARRAY' eq ref $obj ) {
      # Check for deferred method call specified as a scalar
      # ref...
      if ( @$obj && 'SCALAR' eq ref $obj->[0] ) {
        my ( $method, @args ) = @$obj;
        return join '', $self->$$method( @args );
      }
      return join '', map { $self->stringify( $_ ) } @$obj;
    }

    # ...stringify objects...
    my $str;
    return $str if eval { $str = $obj->as_string; 1 };
  }

  # ...default stringification
  return "$obj";
}

=back

=head2 Methods named after tags

In addition to the methods described above, C<< HTML::Tiny >> provides
all of the following HTML generation methods:

  a abbr acronym address applet area article aside audio b base bdi bdo big
  blink blockquote body br button canvas caption center cite code col colgroup
  data datalist dd del details dfn dialog dir div dl dt em embed fieldset
  figcaption figure font footer form frame frameset h1 h2 h3 h4 h5 h6 head
  header hgroup hr html i iframe img input ins kbd keygen label legend li link
  main map mark marquee menu menuitem meta meter nav nobr noframes noscript
  object ol optgroup option output p param picture portal pre progress q rb rp
  rt rtc ruby s samp script section select slot small source spacer span strike
  strong style sub summary sup table tbody td template textarea tfoot th thead
  time title tr track tt u ul var video wbr xmp

The following methods generate closed XHTML (<br />) tags by default:

  area base br col embed hr img input keygen link meta param source track
  wbr

So:

  print $h->br;   # prints <br />
  print $h->input({ name => 'field1' });
                  # prints <input name="field1" />
  print $h->img({ src => 'pic.jpg' });
                  # prints <img src="pic.jpg" />

All other tag methods generate tags to wrap whatever content they
are passed:

  print $h->p('Hello, World');

prints:

  <p>Hello, World</p>

So the following are equivalent:

  print $h->a({ href => 'http://hexten.net' }, 'Hexten');

and

  print $h->tag('a', { href => 'http://hexten.net' }, 'Hexten');

=head2 Utility Methods

=over

=item C<< url_encode( $str ) >>

URL encode a string. Spaces become '+'; every other character except
ASCII letters, digits, '_' and '~' is encoded as '%' followed by its
lowercase hexadecimal character code.

  $h->url_encode( ' <hello> ' )   # returns '+%3chello%3e+'

=cut

sub url_encode {
  my $str = $_[0]->stringify( $_[1] );
  $str
   =~ s/([^A-Za-z0-9_~])/$1 eq ' ' ? '+' : sprintf("%%%02x", ord($1))/eg;
  return $str;
}

=item C<< url_decode( $str ) >>

URL decode a string. Reverses the effect of C<< url_encode >>.

  $h->url_decode( '+%3chello%3e+' )   # returns ' <hello> '

=cut

sub url_decode {
  my $str = $_[1];
  $str =~ s/[+]/ /g;
  $str =~ s/%([0-9a-f]{2})/chr(hex($1))/ieg;
  return $str;
}

=item C<< query_encode( $hash_ref ) >>

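Generate a query string from an anonymous hash of key, value pairs.
Keys are emitted in sorted order, keys and values are URL encoded, and
pairs whose value is undefined are omitted: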

  print $h->query_encode({ a => 1, b => 2 })

would print

  a=1&b=2
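
A slightly fuller sketch showing how the keys are sorted, how values are
URL encoded and how an undefined value is dropped (the parameter names
are arbitrary):

  use HTML::Tiny;

  my $h  = HTML::Tiny->new;
  my $qs = $h->query_encode( { q => 'a b&c', page => 2, debug => undef } );

  print $qs;    # page=2&q=a+b%26c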

=cut

sub query_encode {
  my $self = shift;
  my $hash = shift || {};
  return join '&', map {
    join( '=', map { $self->url_encode( $_ ) } ( $_, $hash->{$_} ) )
  } sort grep { defined $hash->{$_} } keys %$hash;
}

=item C<< entity_encode( $str ) >>

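Encode the characters '<', '>', '&', '\'' and '"' as their HTML entity
equivalents. The two quote characters are emitted as numeric character
references, because HTML does not define C<&apos;>:

  print $h->entity_encode( '<>\'"&' );

would print:

  &lt;&gt;&#39;&#34;&amp;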

=cut

{
  my %ENT_MAP = (
    '&'   => '&amp;',
    '<'   => '&lt;',
    '>'   => '&gt;',
    '"'   => '&#34;',    # shorter than &quot;
    "'"   => '&#39;',    # HTML does not define &apos;
    "\xA" => '&#10;',
    "\xD" => '&#13;',
  );

  my $text_special = qr/([<>&'"])/;
  my $attr_special = qr/([<>&'"\x0A\x0D])/;    # FIXME needs tests

  sub entity_encode {
    my $str = $_[0]->stringify( $_[1] );
    my $char_rx = $_[2] ? $attr_special : $text_special;
    $str =~ s/$char_rx/$ENT_MAP{$1}/eg;
    return $str;
  }
}

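# Render a single attribute. A reference value marks an empty (boolean)
# attribute: HTML mode emits the bare name, XML mode emits name="name".
sub _attr {
  my ( $self, $attr, $val ) = @_;

  if ( ref $val ) {
    return $attr if not $self->_xml_mode;
    $val = $attr;
  }

  # Second argument selects the attribute-value character set
  my $enc_val = $self->entity_encode( $val, 1 );
  return qq{$attr="$enc_val"};
}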

sub _xml_mode { $_[0]->{'_mode'} eq 'xml' }

sub validate_tag {
  # Do nothing. Subclass to throw an error for invalid tags
}

sub _tag {
  my ( $self, $closed, $name ) = splice @_, 0, 3;

  croak "Attributes must be passed as hash references"
   if grep { 'HASH' ne ref $_ } @_;

  # Merge attribute hashes
  my %attr = map { %$_ } @_;

  $self->validate_tag( $closed, $name, \%attr );

  # Generate markup
  my $tag = join( ' ',
    "<$name",
    map { $self->_attr( $_, $attr{$_} ) }
     sort grep { defined $attr{$_} } keys %attr );

  return $tag . ( $closed && $self->_xml_mode ? ' />' : '>' );
}

{
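  # JSON escape sequences for the control characters 0x00-0x1f. JSON only
  # defines \b, \t, \n, \f and \r as short escapes; every other control
  # character has to use the \uXXXX form.
  my @UNPRINTABLE = qw(
   u0000  u0001  u0002  u0003  u0004  u0005  u0006  u0007
   b      t      n      u000b  f      r      u000e  u000f
   u0010  u0011  u0012  u0013  u0014  u0015  u0016  u0017
   u0018  u0019  u001a  u001b  u001c  u001d  u001e  u001f
  );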

  sub _json_encode_ref {
    my ( $self, $seen, $obj ) = @_;
    my $type = ref $obj;
    if ( 'HASH' eq $type ) {
      return '{' . join(
        ',',
        map {
             $self->_json_encode( $seen, $_ ) . ':'
           . $self->_json_encode( $seen, $obj->{$_} )
         } sort keys %$obj
      ) . '}';
    }
    elsif ( 'ARRAY' eq $type ) {
      return
         '['
       . join( ',', map { $self->_json_encode( $seen, $_ ) } @$obj )
       . ']';
    }
    elsif ( UNIVERSAL::can( $obj, 'can' ) && $obj->can( 'TO_JSON' ) ) {
      return $self->_json_encode( $seen, $obj->TO_JSON );
    }
    else {
      croak "Can't json_encode a $type";
    }
  }

  # Minimal JSON encoder. Provided here for completeness - it's useful
  # when generating JS.
  sub _json_encode {
    my ( $self, $seen, $obj ) = @_;

    return 'null' unless defined $obj;

    if ( my $type = ref $obj ) {
      croak "json_encode can't handle self referential structures"
       if $seen->{$obj}++;
      my $rep = $self->_json_encode_ref( $seen, $obj );
      delete $seen->{$obj};
      return $rep;
    }

    return $obj if $obj =~ /^-?\d+(?:[.]\d+)?$/;

    $obj = $self->stringify( $obj );
    $obj =~ s/\\/\\\\/g;
    $obj =~ s/"/\\"/g;
    $obj =~ s/ ( [\x00-\x1f] ) / '\\' . $UNPRINTABLE[ ord($1) ] /gex;

    return qq{"$obj"};
  }
}

=item C<< json_encode >>

Encode a data structure in JSON (JavaScript) format:

  print $h->json_encode( { ar => [ 1, 2, 3, { a => 1, b => 2 } ] } );

would print:

  {"ar":[1,2,3,{"a":1,"b":2}]}

Because JSON is valid JavaScript, this method can be useful when
generating ad-hoc JavaScript. For example:

  my $some_perl_data = {
    score   => 45,
    name    => 'Fred',
    history => [ 32, 37, 41, 45 ]
  };

  # Transfer value to Javascript
  print $h->script( { type => 'text/javascript' },
      "\nvar someVar = " . $h->json_encode( $some_perl_data ) . ";\n " );

  # Prints
  # <script type="text/javascript">
  # var someVar = {"history":[32,37,41,45],"name":"Fred","score":45};
  # </script>

If you attempt to JSON-encode a blessed object, C<json_encode> will look
for a C<TO_JSON> method and, if found, use its return value as the
structure to be converted in place of the object. An attempt to encode a
blessed object that does not implement C<TO_JSON> will C<croak>.
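
A minimal sketch of the C<TO_JSON> hook; C<My::Point> is a hypothetical
class used only for illustration:

  package My::Point;    # hypothetical class, for illustration only

  sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

  # Return a plain data structure to be encoded in place of the object
  sub TO_JSON {
    my $self = shift;
    return { x => $self->{x}, y => $self->{y} };
  }

  package main;

  use HTML::Tiny;

  my $h    = HTML::Tiny->new;
  my $json = $h->json_encode( [ My::Point->new( x => 1, y => 2 ) ] );

  print $json;    # [{"x":1,"y":2}]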

=cut

sub json_encode { shift->_json_encode( {}, @_ ) }

1;

__END__

=pod

=encoding UTF-8

=back

=head2 Subclassing

An C<< HTML::Tiny >> object is a blessed hash reference.

=over

=item C<< validate_tag( $closed, $name, $attr ) >>

Subclass C<validate_tag> to throw an error or issue a warning when an
attempt is made to generate an invalid tag.
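
One possible shape for such a subclass, as a sketch only. The class name
C<My::StrictHTML> and the allow-list are assumptions made up for this
example, not part of the module:

  package My::StrictHTML;    # hypothetical subclass

  use strict;
  use warnings;
  use Carp;
  use parent 'HTML::Tiny';

  # Illustrative allow-list; adjust to taste
  my %ALLOWED = map { $_ => 1 } qw( p em strong a );

  sub validate_tag {
    my ( $self, $closed, $name, $attr ) = @_;
    croak "Tag <$name> is not allowed" unless $ALLOWED{$name};
  }

  package main;

  my $h    = My::StrictHTML->new;
  my $fine = $h->p( 'fine' );    # <p>fine</p> followed by a newline

  my $ok    = eval { $h->tag( 'blink', 'no' ); 1 };
  my $error = $@;
  print "Rejected: $error" unless $ok;

Note that C<validate_tag> is invoked when an opening or closed tag is
generated, so a rejected tag raises its error before any markup for it
is returned.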

=back

=head1 AUTHOR

Andy Armstrong <andy@hexten.net>

Aristotle Pagaltzis <pagaltzis@gmx.de>

=head1 COPYRIGHT AND LICENSE

This software is copyright (c) 2008 by Andy Armstrong.

This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.

=cut