Net::PublicSuffixList - The Mozilla Public Suffix List
    use Net::PublicSuffixList;

    my $psl = Net::PublicSuffixList->new;

    my $host = 'amazon.co.uk';

    # get all the suffixes in host (like, uk and co.uk)
    my $suffixes = $psl->suffixes_in( $host );

    # get the longest suffix
    my $suffix = $psl->longest_suffix_in( $host );

    my $hash = $psl->split_host( $host );
I mostly wrote this because I was working on App::url and needed a way to figure out which part of a URL was the registered part and which was the top-level domain.
The Public Suffix List is essentially a self-reported collection of the top-level, generic, country code, or whatever domains.
There are other modules that try to do this, but they come with packaged (old) versions of the Public Suffix List or have limited functionality.
This module can fetch the most current list for you, use one that you provide locally, or even let you make one up completely. You can add entries that you want but that don't show up in the list, and remove ones you don't think should be there.
Create the new object and specify how you'd like to get the data. The network file is about 220 KB, so you might want to fetch it once, store it, and then use local_path to load the stored copy.
The constructor first tries to use a local file. If you've disabled that with no_local or the file doesn't exist, it moves on to trying the network. If you've disabled the network with no_net, then it complains but still returns the object. You can still construct your own list with add_suffix.
    list_url      # the URL for the suffix list
    local_path    # the path to a local file that has the suffix list
    no_net        # do not use the network
    no_local      # do not use a local file
    cache_dir     # location to save the fetched file
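The order the constructor follows can be sketched as a small decision function. This is only an illustration of the description above, not the module's code: choose_source is a hypothetical helper, and the way it reads no_local, no_net, and local_path from a plain list of pairs is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Net::PublicSuffixList: report which
# source a constructor following the described order would end up using.
sub choose_source {
    my( %args ) = @_;

    # A local file wins when it is allowed and actually exists
    if( ! $args{no_local} and defined $args{local_path} and -e $args{local_path} ) {
        return 'local';
    }

    # Otherwise fall back to the network, unless that is disabled too
    return 'net' unless $args{no_net};

    # Nothing is available: complain, but let the caller carry on
    warn "No usable source for the suffix list\n";
    return 'none';
}

print choose_source( no_local => 1 ), "\n";                # net
print choose_source( no_local => 1, no_net => 1 ), "\n";   # warns, then none
```

In the last case you would start with an empty list and fill it yourself.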
A hash of the default values for everything.
- parse_list( STRING_REF )
Take a scalar reference to the contents of the public suffix list, find all the suffixes, and add them to the object.
- add_suffix( STRING )
Add STRING to the known public suffixes. This returns the object itself.
Before this adds the suffix, it strips off leading .* characters. Some sources specify *.foo.bar, but this adds foo.bar.
- remove_suffix( STRING )
Remove STRING from the known public suffixes. This returns the object itself.
- suffix_exists( STRING )
Return the invocant if the suffix exists, and the empty list otherwise.
- suffixes_in( HOST )
Return an array reference of the public suffixes in HOST, sorted from shortest to longest.
- longest_suffix_in( HOST )
Return the longest public suffix in HOST.
- split_host( HOST )
Returns a hash reference with these keys:
    host      the input value
    suffix    the longest public suffix
    short     the input value with the public suffix (and leading dot) removed
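To make the behavior of these methods concrete, here is a standalone sketch of the same ideas using plain subroutines and a hash. It is not the module's implementation: the sub bodies, the %known hash, and the three-rule sample list are made up for illustration. It also assumes that comment lines start with // (as they do in the published list), that a leading run of * and . characters is what gets stripped, and it simply skips exception rules that start with !.

```perl
use strict;
use warnings;

# Standalone illustration; none of this code comes from Net::PublicSuffixList.
my %known;   # suffix => 1

# Strip a leading run of '*' and '.' before storing (assumed normalization)
sub add_suffix {
    my( $suffix ) = @_;
    $suffix =~ s/\A[*.]+//;
    $known{ lc $suffix } = 1;
}

# Read the list text through a scalar reference, skipping blank lines,
# '//' comments, and (as a simplification) '!' exception rules
sub parse_list {
    my( $text_ref ) = @_;
    foreach my $line ( split /\R/, $$text_ref ) {
        next if $line =~ m{\A\s*(?://|\z)};
        next if $line =~ /\A!/;
        my( $rule ) = $line =~ /(\S+)/;
        add_suffix( $rule ) if defined $rule;
    }
}

# Every trailing run of labels that is a known suffix, shortest first
sub suffixes_in {
    my( $host ) = @_;
    my @labels = split /\./, lc $host;
    my @found;
    foreach my $i ( reverse 1 .. $#labels ) {
        my $candidate = join '.', @labels[ $i .. $#labels ];
        push @found, $candidate if $known{ $candidate };
    }
    return \@found;
}

# The last element is the longest; undef when nothing matched
sub longest_suffix_in {
    my( $host ) = @_;
    return suffixes_in( $host )->[-1];
}

# Same three keys as described above: host, suffix, short
sub split_host {
    my( $host ) = @_;
    my $suffix = longest_suffix_in( $host );
    my $short  = $host;
    $short =~ s/\.\Q$suffix\E\z//i if defined $suffix;
    return { host => $host, suffix => $suffix, short => $short };
}

my $list = "// made-up sample rules\nuk\nco.uk\n*.foo.bar\n";
parse_list( \$list );

my $parts = split_host( 'amazon.co.uk' );
print join( ' ', @{ suffixes_in( 'amazon.co.uk' ) } ), "\n";   # uk co.uk
print "$parts->{short} + $parts->{suffix}\n";                   # amazon + co.uk
```

The sketch never treats the whole host as its own suffix, which keeps short from ending up empty; whether the module makes the same choice is not stated here.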
Fetch the public suffix list plaintext file from the path returned by local_path. Returns a scalar reference to the text of the raw UTF-8 octets.
Fetch the public suffix list plaintext file from the URL returned by url. Returns a scalar reference to the text of the raw UTF-8 octets.
If you've set cache_dir in the object, this method attempts to cache the response in that directory using default_local_file as the filename. This cache is different from local_file although you can use it as
Return the configured URL for the public suffix list.
Return the default URL for the public suffix list.
Return the configured local path for the public suffix list.
Return the default local path for the public suffix list.
Return the configured filename for the public suffix list.
Return the default filename for the public suffix list.
This source is in GitHub:
brian d foy,
Copyright © 2020-2021, brian d foy, All Rights Reserved.
You may redistribute this under the terms of the Artistic License 2.0.