=encoding utf8

=head1 TITLE

Synopsis 11: Modules

=head1 AUTHORS

    Larry Wall <larry@wall.org>

=head1 VERSION

    Created: 27 Oct 2004

    Last Modified: 25 Oct 2010
    Version: 34

=head1 Overview

This synopsis discusses those portions of Apocalypse 12 that ought to have
been in Apocalypse 11.

=head1 Modules

As in Perl 5, a module is just a kind of package.  Unlike in
Perl 5, modules and classes are declared with separate keywords,
but they're still just packages with extra behaviors.

A module is declared with the C<module> keyword.  There are
two basic declaration syntaxes:

    module Foo; # rest of scope is in module Foo
    ...

    module Bar {...}    # block is in module Bar

A named module declaration can occur as part of an expression, just like
named subroutine declarations.

Since there are no barewords in Perl 6, module names must be predeclared,
or use the sigil-like C<::ModuleName> syntax.  The C<::> prefix does not
imply globalness as it does in Perl 5.  (Use C<GLOBAL::> for that.)

A bare (unscoped) C<module> declarator declares a nested C<our> module
name within the current package.  However, at the start of the file,
the current package is C<GLOBAL>, so the first such declaration in the
file is automatically global.

X<use>
You can use C<our module> to explicitly
declare a module in the current package (or module, or class).
To declare a lexically scoped module, use C<my module>.
Module names are always searched for from innermost scopes to outermost.
As with an initial C<::>, the presence of a C<::> within the name
does not imply globalness (unlike in Perl 5).
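
As a small sketch of these scoping rules (the names C<Zoo>, C<Birds>,
C<Reptile> and C<noise> are invented for illustration and are not part of
any library), a package-scoped inner module remains reachable through its
enclosing module, while a lexically scoped one is visible only inside
the block that declares it:

    module Zoo {
        our module Birds   { our sub noise { 'tweet' } }
        my  module Reptile { our sub noise { 'hiss'  } }

        # Inside Zoo both short names are found, searching
        # from the innermost scope outward.
        say Birds::noise();
        say Reptile::noise();
    }

    # Outside Zoo, only the package-scoped module can be named.
    say Zoo::Birds::noise();

An attempt to call C<Zoo::Reptile::noise()> from outside would be expected
to fail, since C<Reptile> was never installed in the C<Zoo> package.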

The default namespace for the main program is C<GLOBAL>.
(Putting C<module GLOBAL;> at the top of your program
is redundant, except insofar as it tells Perl that the code is Perl
6 code and not Perl 5 code.  But it's better to say "use v6" for that.)

Module traits are set using C<is>:

    module Foo is bar {...}

An anonymous module may be created with either of:

    module {...}
    module :: {...}

The second form is useful if you need to apply a trait:

    module :: is bar {...}

=head1 Exportation

Exportation is now done by trait declaration on the exportable item:

    module Foo;                                # Tagset...
    sub foo is export                   {...}  #  :DEFAULT, :ALL
    sub bar is export(:DEFAULT :others) {...}  #  :DEFAULT, :ALL, :others
    sub baz is export(:MANDATORY)       {...}  #  (always exported)
    sub bop is export(:ALL)             {...}  #  :ALL
    sub qux is export(:others)          {...}  #  :ALL, :others

Declarations marked as C<is export> are bound into the module's C<EXPORT>
inner module, with their tagsets as the names of inner modules within it.
For example,
the C<sub bar> above will bind as C<&Foo::EXPORT::DEFAULT::bar>,
C<&Foo::EXPORT::ALL::bar>, and C<&Foo::EXPORT::others::bar>.

Tagset names consisting entirely of capitals are reserved for Perl.

Inner modules automatically add their export list to modules in all their
outer scopes:

    module Foo {
        sub foo is export {...}
        module Bar {
            sub bar is export {...}
            module Baz {
                sub baz is export {...}
            }
        }
    }

The C<Foo> module will export C<&foo>, C<&bar> and C<&baz> by default;
calling C<Foo::Bar::.EXPORTALL> will export C<&bar> and C<&baz> at runtime
to the caller's package.

Any C<proto> declaration that is not declared C<my> is exported by default.
Any C<multi> that depends on an exported C<proto> is also automatically exported.
Any autogenerated C<proto> is assumed to be exported by default.

=head1 Dynamic exportation

The default C<EXPORTALL> handles symbol exports by removing recognized
export items and tagsets from the argument list, then calling the C<EXPORT>
subroutine in that module (if there is one) with the remaining
arguments.

If the exporting module is actually a class, C<EXPORTALL> will invoke its
C<EXPORT> method with the class itself as the invocant.

=head1 Compile-time Importation
X<use>

[Note: the :MY forms are being rethought currently.]

Importing via C<use> binds into the current lexical scope by default
(rather than the current package, as in Perl 5).

    use Sense <common @horse>;

You can be explicit about the desired namespace:

    use Sense :MY<common> :OUR<@horse>;

That's pretty much equivalent to:

    use Sense;
    my &common ::= Sense::<&common>;
    our @horse ::= Sense::<@horse>;

It is also possible to re-export the imported symbols:

    use Sense :EXPORT;                  # import and re-export the defaults
    use Sense <common> :EXPORT;         # import "common" and re-export it
    use Sense <common> :EXPORT<@horse>; # import "common" but export "@horse"

When the caller does not specify a scope, the module may supply its own
scoping default by passing C<:MY> or C<:OUR> tags as arguments to
C<is export>.  (Of course, mixing incompatible scoping in different scopes
is likely to lead to confusion.)

The C<use> declaration is actually a composite of two other declarations,
C<need> and C<import>.  Saying

    use Sense <common @horse>;

breaks down into:

    need Sense;
    import Sense <common @horse>;

These further break down into:

    BEGIN {
      my $target ::= OUTER;
      for <Sense> {
        my $scope = load_module(find_module_defining($_));
        # install the name of the type
        $target.install_alias($_, $scope{$_}) if $scope.exists{$_};
        # get the package declared by the name in that scope,
        my $package_name = $_ ~ '::';
        # if there isn't any, then there's just the type...
        my $loaded_package = $scope{$package_name} or next;
        # get a copy of the package, to avoid action-at-a-distance
        # install it in the target scope
        $target{$package_name} := $loaded_package.copy;
        # finally give the chance for the module to install
        # the selected symbols
        $loaded_package.EXPORTALL($target, <common @horse>);
      }
    }

=head2 Loading without importing
X<need>

The C<need> declarator takes a list of modules and loads them (at
compile time) without importing any symbols.  It's good for loading
class modules that have nothing to export (or nothing that you want
to import):

    need ACME::Rocket;
    my $r = ACME::Rocket.new;

This declaration is equivalent to Perl 5's:

    use ACME::Rocket ();

Saying

    need A,B,C;

is equivalent to:

    BEGIN {
      my $target ::= OUTER;
      for <A B C> {
        my $scope = load_module(find_module_defining($_));
        # install the name of the type
        $target.install_alias($_, $scope{$_}) if $scope.exists{$_};
        # get the package declared by the name in that scope,
        my $package_name = $_ ~ '::';
        # if there isn't any, then there's just the type...
        my $loaded_package = $scope{$package_name} or next;
        # get a copy of the package, to avoid action-at-a-distance
        # install it in the target scope
        $target{$package_name} := $loaded_package.copy;
      }
    }

=head2 Importing without loading
X<import>

The importation into your lexical scope may also be a separate declaration
from loading.  This is primarily useful for modules declared inline, which
do not automatically get imported into their surrounding scope:

    my module Factorial {
        sub fact (Int $n) is export { [*] 1..$n }
    }
    ...
    import Factorial 'fact';   # imports the sub

The last declaration is syntactic sugar for:

    BEGIN Factorial.WHO.EXPORTALL(MY, 'fact');

This form functions as a compile-time declarator, so that these
notations can be combined by putting a declarator in parentheses:

    import (role Silly {
        enum Ness is export <Dilly String Putty>;
    }) <Ness>;

This really means:

    BEGIN (role Silly {
        enum Ness is export <Dilly String Putty>;
    }).WHO.EXPORTALL(MY, <Ness>)


Without an import list, C<import> imports the symbols in the module's
C<:DEFAULT> tagset.
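
The following sketch shows an inline module imported with no import list.
The module C<Temperature>, its subs, and the C<:extras> tagset are
hypothetical names chosen only for this example:

    my module Temperature {
        sub celsius (Numeric $f) is export          { ($f - 32) * 5 / 9 }
        sub kelvin  (Numeric $c) is export(:extras) { $c + 273.15 }
    }

    import Temperature;    # no import list given
    say celsius(212);

    # &kelvin is tagged only :extras (and therefore :ALL), so it is
    # not brought into this scope by the bare import above.

Here C<&celsius> is available because a plain C<is export> places it in
the default tagset, whereas C<&kelvin> would have to be requested
explicitly.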

=head1 Runtime Importation

Importing via C<require> also installs names into the current lexical scope by
default, but delays the actual binding till runtime:

    require Sense <common @horse>;

This means something like:

    BEGIN MY.declare_stub_symbols('Sense', <common @horse>);
    # run time!
    MY.import_realias(:from(load_module(find_module_defining('Sense'))), 'Sense');
    MY.import_realias(:from(Sense), <common @horse>);

(The C<.import_realias> requires that the symbols to be imported already
exist; this differs from C<.import_alias>, which requires that the
imported symbols I<not> already exist in the target scope.)

Alternately, a filename may be mentioned directly, which installs a
package that is effectively anonymous to the current lexical scope,
and may only be accessed by whatever global names the module installs:

    require "/home/non/Sense.pm" <common @horse>;

which breaks down to:

    BEGIN MY.declare_stub_symbols(<common @horse>);
    MY.import_realias(:from(load_module("/home/non/Sense.pm")), <common @horse>);

Only explicitly mentioned names may be so imported.  In order
to protect the run-time sanctity of the lexical pad, it may not be
modified by C<require>.  Tagsets are assumed to be unknown at compile
time, hence tagsets are not allowed in the default import list to
C<:MY>, but you can explicitly request to put names into the C<:OUR>
scope, since that is modifiable at run time:

    require Sense <:ALL>;    # does not work
    require Sense :MY<ALL>;  # this doesn't work either
    require Sense :OUR<ALL>; # but this works

If the import list is omitted, then nothing is imported.  Since you
may not modify the lexical pad, calling an importation routine at
runtime cannot import into the lexical scope, and defaults to importation
to the package scope instead:

    require Sense;
    Sense.EXPORTALL;   # goes to the OUR scope by default, not MY

(Such a routine I<may> rebind existing lexicals, however.)

=head1 Importing from a pseudo-package

You may also import symbols from the various pseudo-packages listed in S02.
They behave as if all their symbols are in the C<:ALL> export list:

    import PROCESS <$IN $OUT $ERR>;
    import CALLER <$x $y>;

    # Same as:
    #     my ($IN, $OUT, $ERR) := PROCESS::<$IN $OUT $ERR>
    #     my ($x, $y) := ($CALLER::x, $CALLER::y)

[Conjecture: this section may go away, since the aliasing forms
are not all that terrible, and it's not clear that we want the
overhead of emulating export lists.]

=head1 Versioning

When at the top of a file you say something like

    module Squirrel;

or

    class Dog;

you're really only giving one part of the name of the module.
The full name of the module or class includes other metadata,
in particular the author and the version.

Modules posted to CPAN or entered into any standard Perl 6 library
are required to declare their full name so that installations can know
where to keep them, such that multiple versions by different authors
can coexist, all of them available to any installed version of Perl.
(When we say "modules" here we don't mean only modules declared with
the C<module> declarator, but also classes, roles, grammars, etc.)

Such modules are also required to specify exactly which version (or
versions) of Perl they are expecting to run under, so that future
versions of Perl can emulate older versions of Perl (or give a cogent
explanation of why they cannot).  This will allow the language to
evolve without breaking existing widely used modules.  (Perl 5 library
policy is notably lacking here; it would induce massive breakage even
to change Perl 5 to make strictness the default.)  If a CPAN module
breaks because it declares that it supports future versions of Perl
when it doesn't, then it must be construed to be the module's fault,
not Perl's.  If Perl evolves in a way that does not support emulation
of an older version (at least, back to 6.0.0), then it's Perl's fault
(unless the change is required for security, in which case it's the
fault of the insensitive clod who broke security :).

The internal API for package names is always case-sensitive, even if
the library system is hosted on a system that is not case-sensitive.
Likewise internal names are Unicode-aware, even if the filesystem isn't.
This implies either some sort of name mangling capability or storage
of intermediate products into a database of some sort.  In any event,
the actual storage location must be encapsulated in the library system
such that it is hidden from all language level naming constructs.
(Provision must be made for interrogating the library system for
the actual location of a module, of course, but this falls into
the category of introspection.)  Note also that distributions
need to be distributed in a way that they can be installed on
case-insensitive systems without loss of information.  That's fine,
but the language-level abstraction must not leak details of this
mechanism without the user asking for the details to be leaked.

The syntax of a versioned module or class declaration has multiple
parts in which the non-identifier parts are specified in adverbial pair
notation without intervening spaces.  Internally these are stored in
a canonical string form which you should ignore.  You may write the
various parts in any order, except that the bare identifier must come
first.  The required parts for library insertion are the short name of the
class/module, a URI identifying the author (or authorizing authority, so we
call it "auth" to be intentionally ambiguous), and its version number.
For example:

    class Dog:auth<cpan:JRANDOM>:ver<1.2.1>;
    class Dog:auth<http://www.some.com/~jrandom>:ver<1.2.1>;
    class Dog:auth<mailto:jrandom@some.com>:ver<1.2.1>;

Since these are somewhat unwieldy to look at, we allow a shorthand in
which a bare subscripty adverb interprets its elements according to their
form:

    class Dog:<cpan:JRANDOM 1.2.1>

The pieces are interpreted as follows:

=over

=item *

Anything matching C<< [<ident> '::']* <ident> >> is treated as a
package name

=item *

Anything matching C<< <alpha>+ \: \S+ >> is treated as an author(ity)

=item *

Anything matching C<< v? [\d+ '.']* \d+ >> is treated as a version number

=back

These declarations automatically alias the full name of the class
(or module) to the short name.  So for the rest of the lexical scope,
C<Dog> refers to the longer name.  The real library name can be
specified separately as another adverb, in which case the identifier
indicates only the alias within the current lexical scope:

    class Pooch:name<Dog>:auth<cpan:JRANDOM>:ver<1.2.1>

or

    class Pooch:<Dog cpan:JRANDOM 1.2.1>

for short.

Here the real name of the module starts with C<Dog>, but we refer to it
as C<Pooch> for the rest of this file.  Aliasing is handy if you need to
interface to more than one module named C<Dog>.

If there are extra classes or modules or packages declared within
the same file, they implicitly have a long name including the file's
version and author, but you needn't declare them again.

Since these long names are the actual names of the classes as far as
the library system is concerned, when you say:

    use Dog;

you're really wildcarding the unspecified bits:

    use Dog:auth(Any):ver(Any);

And when you say:

    use Dog:<1.2.1>;

you're really asking for:

    use Dog:auth(Any):ver<1.2.1>;

Saying C<1.2.1> specifies an I<exact> match on that part of the
version number, not a minimum match.  To match more than one version,
put a range operator as a selector in parens:

    use Dog:ver(v1.2.1..v1.2.3);
    use Dog:ver(v1.2.1..^v1.3);
    use Dog:ver(v1.2.1..*);
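
Since such a selector is an ordinary range, its effect can be tried out on
version literals directly.  This is only a sketch: the list of "installed"
versions is made up, and real module resolution also involves the library
lookup and the authority, which are not modeled here:

    # Hypothetical set of versions available in the library.
    my @installed = v1.2.0, v1.2.1, v1.2.2, v1.3.0;

    # Keep the versions that smartmatch against each range.
    say @installed.grep(v1.2.1 .. v1.2.3);
    say @installed.grep(v1.2.1 ..^ v1.3);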

When specifying the version of your own module, C<1.2> is equivalent
to C<1.2.0>, C<1.2.0.0>, and so on.  However C<use> searches for
modules matching a version prefix, so the subversions are wildcarded,
and in this context C<< :ver<1.2> >> really means C<< :ver<1.2.*> >>.
If you say:

    use v6;

which is short for:

    use Perl:ver<6.*>;

you're asking for any version of Perl 6.  You need to say something like

    use Perl:<6.0>;
    use Perl:<6.0.0>;
    use Perl:<6.2.7.1>;

if you want to lock in a particular set of semantics at some greater
degree of specificity.  And if some large company ever forks Perl, you can say
something like:

    use Perl:auth<cpan:TPF>

to guarantee that you get the unembraced Perl.  C<:-)>

When it happens that the same module is available from more than one
authority, and the desired authority is not specified by the C<use>,
the version lineage that was created first wins, unless overridden by
local policy or by official abandonment by the original authority (as
determined either by the author or by community consensus in case the
author is no longer available or widely regarded as uncooperative).
An officially abandoned lineage will be selected only if it is the
only available lineage of locally installed modules.

Once the authority is selected, then and only then is any version
selection done; the version specification is ignored until the
authority is selected.  This implies that all official modules record
permanently when they were first installed in the official library,
and this creation date is considered immutable.

For wildcards any valid smartmatch selector works:

    use Dog:auth(/:i jrandom/):ver(v1.2.1 | v1.3.4);
    use Dog:auth({ .substr(0,5) eq 'cpan:'}):ver(Any);
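
To see what these two authority selectors would accept, the same matchers
can be applied to plain strings.  The list of authorities below is invented
for illustration, and the sketch assumes only that the selector is
smartmatched against the authority string:

    # Hypothetical authority strings for modules named Dog.
    my @authorities = 'cpan:JRANDOM', 'cpan:DCONWAY', 'mailto:jrandom@some.com';

    # Case-insensitive regex selector.
    say @authorities.grep(/:i jrandom/);

    # Closure selector: authorities whose first five characters are 'cpan:'.
    say @authorities.grep({ .substr(0,5) eq 'cpan:' });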

In any event, however you select the module, its full name is
automatically aliased to the short name for the rest of your lexical
scope.  So you can just say

    my Dog $spot .= new("woof");

and it knows (even if you don't) that you mean

    my Dog:<cpan:JRANDOM 1.3.4> $spot .= new("woof");

The C<use> statement allows an external language to be specified in
addition to (or instead of) an authority, so that you can use modules
from other languages.  The C<from> adverb also parses any additional
parts as short-form arguments.  For instance:

    use Whiteness:from<perl5>:name<Acme::Bleach>:auth<cpan:DCONWAY>:ver<1.12>;
    use Whiteness:from<perl5 Acme::Bleach cpan:DCONWAY 1.12>;  # same thing

The string form of a version recognizes the C<*> wildcard in place of any
position.  It also recognizes a trailing C<+>, so

    :ver<6.2.3+>

is short for

    :ver(v6.2.3 .. v6.2.*)

And saying

    :ver<6.2.0+>

specifically rules out any prereleases.

If two different modules in your program require two different
versions of the same module, Perl will simply load both versions at
the same time.  For modules that do not manage exclusive resources,
the only penalty for this is memory, and the disk space in the library
to hold both the old and new versions.  For modules that do manage
an exclusive resource, such as a database handle, there are two approaches
short of requiring the user to upgrade.  The first is simply to refactor
the module into a stable supplier of the exclusive resource that doesn't
change version often, and then the outer wrappers of that resource can
both be loaded and use the same supplier of the resource.

The other approach is for the module to keep the management of its exclusive
resource, but offer to emulate older versions of the API.  Then if there
is a conflict over which version to use, the new one is used by both users,
but each gets a view that is consistent with the version it thinks it is
using.  Of course, this depends crucially on how well the new version
actually emulates the old version.

To declare that a module emulates an older version, declare it like this:

    class Dog:<cpan:JRANDOM 1.2.1> emulates :<1.2.0>;

Or to simply exclude use of the older module and (presumably) force
the user to upgrade:

    class Dog:<cpan:JRANDOM 1.2.1> excludes :<1.2.0>;

The name is parsed like a C<use> wildcard, and you can have more than one,
so you can say things like:

    class Dog:<cpan:JRANDOM 1.2.1>
        emulates Dog:auth(DCONWAY|JCONWAY|TCONWAY):ver<1.0+>
        excludes Fox:<http://oreillymedia.com 3.14159>
        emulates Wolf:from<C# 0.8..^1.0>;

=head1 Forcing Perl 6

To get Perl 6 parsing rather than the default Perl 5 parsing,
we said you could force Perl 6 mode in your main program with:

    use v6;

Actually, if you're running a parser that is aware of Perl 6, you
can just start your main program with any of:

    use v6;
    module;
    class;

Those all specify the latest Perl 6 semantics, and are equivalent to

    use Perl:auth(Any):ver(v6..*);

To lock the semantics to 6.0.0, say one of:

    use Perl:ver<6.0.0>;
    use :<6.0.0>;
    use v6.0.0;

In any of those cases, strictures and warnings are the default
in your main program.  But if you start your program with a bare
version number or other literal:

    v6.0.0;
    v6;
    6;
    "Coolness, dude!";

it runs Perl 6 in "lax" mode, without strictures or warnings, since
obviously a bare literal in a sink (void) context I<ought> to have
produced a "Useless use of..." warning.  (Invoking perl with C<-e '6;'>
has the same effect.)

In the other direction, to inline Perl 5 code inside a Perl 6 program, put
C<use v5> at the beginning of a lexical block.  Such blocks can nest arbitrarily
deeply to switch between Perl versions:

    use v6;
    # ...some Perl 6 code...
    {
        use v5;
        # ...some Perl 5 code...
        {
            use v6;
            # ...more Perl 6 code...
        }
    }

It's not necessary to force Perl 6 if the interpreter or command
specified already implies it, such as use of a "C<#!/usr/bin/perl6>"
shebang line.  Nor is it necessary to force Perl 6 in any file that
begins with the "class" or "module" keywords.

=head1 Tool use vs language changes

In order that language processing tools know exactly what language
they are parsing, it is necessary for the tool to know exactly which
variant of Perl 6 is being parsed in any given scope.  All Perl 6
compilation units that are complete files start out at the top of the
file in the Standard Dialect (which itself has versions that correspond
to the same version of the official Perl test suite).  Eval strings,
on the other hand, start out in the language variant in use at the
point of the eval call, so that you don't suddenly lose your macro
definitions inside eval.

All language tweaks from the start of the compilation unit must
be tracked.  Tweaks can be specified either directly in your code as
macros and such, or such definitions may be imported from a module.
As the compiler progresses through the compilation unit, other grammars
may be substituted in an inner lexical scope for an outer grammar,
and parsing continues under the new grammar (which may or may not be
a derivative of the standard Perl grammar).

Language tweaks are considered part of the interface of any module
you import.  Version numbers are assumed to represent a combination of
interface and patch level.  We will use the term "interface version"
to represent that part of the version number that represents the
interface.  For typical version number schemes, this is the first two
numbers (where the third number usually represents patch level within
a constant interface).  Other schemes are possible though.  (It is
recommended that branches be reflected by differences in authority
rather than differences in version, whenever that makes sense.  To make
it make sense more often, some hierarchical authority-naming scheme
may be devised so that authorities can have temporary subauthorities
to hold branches without relinquishing overall naming authority.)

So anyway, the basic rule is this: you may import language tweaks
from your own private (user-library) code as you like; however, all
imports of language tweaks from the official library must specify
the exact interface version of the module.

Such officially installed interface versions must be considered
immutable on the language level, so that once any language-tweaking
module is in circulation, it may be presumed to represent a fixed
language change.  By examination of these interface versions a language
processing tool can know whether it has sufficient information to
know the current language.

In the absence of that information, the tool can choose either
to download and use the module directly, or the tool can proceed
in ignorance.  As an intermediate position, if the tool does not
actually care about running the code, the tool need not actually have
the complete module in question; many language tweaks could be stored
in a database of interface versions, so if the tool merely knows the
nature of the language tweak on the basis of the interface version it
may well be able to proceed with perfect knowledge.  A module that
uses a well-behaved macro or two could be fairly easily emulated
based on the version info alone.

But more realistically, in the absence of such a hypothetical database,
most systems already come with a kind of database for modules that
have already been installed.  So perhaps the most common case is
that you have downloaded an older version of the same module, in
which case the tool can know from the interface version whether that
older module represents the language tweak sufficiently well that
your tool can use the interface definition from that module without
bothering to download the latest patch.

Note that most class modules do no language tweaking, and in any case
cannot perform language tweaks unless these are explicitly exported.

Modules that export C<multi>s are technically language tweaks on the
semantic level, but as long as those new definitions modify semantics
within the existing grammar (by avoiding the definition of new macros
or operators), they do not fall into the language tweak category.
Modules that export new operators or macros are always considered
language tweaks.  (Unexported macros or operators intended only for
internal use of the module itself do not count as language tweaks.)

The requirement for immutable interfaces extends transitively to
any modules imported by a language tweak module.  There can be no
indeterminacy in the language definition either directly or indirectly.

It must be possible for any official module to be separately compiled
without knowledge of the lexical or dynamic context in which it will be embedded, and
this separate compilation must be able to produce a deterministic
profile of the interface.  It must be possible to extract out the
language tweaking part of this profile for use in tools that wish to
know how to parse the current language variant deterministically.


=for vim:set expandtab sw=4: