Code::CutNPaste - Find Duplicate Perl Code


Version 0.31


    use Code::CutNPaste;

    my $cutnpaste = Code::CutNPaste->new(
        dirs         => [ 'lib', 'path/to/other/lib' ],
        renamed_vars => 1,
        renamed_subs => 1,
    );

    my $duplicates = $cutnpaste->duplicates;

    foreach my $duplicate (@$duplicates) {
        my ( $left, $right ) = ( $duplicate->left, $duplicate->right );
        printf <<'END', $left->file, $left->line, $right->file, $right->line;

    Possible duplicate code found
    Left:  %s line %d
    Right: %s line %d

    END
        print $duplicate->report;
    }


ALPHA code, though it works fairly well. You probably want to use the find_duplicate_perl command line program that ships with this distribution.

A simple, heuristic code duplication checker. It will not work if the code does not compile. See also the find_duplicate_perl program, which is installed with this module.

Attributes to constructor


An array ref of dirs to search for Perl code. Defaults to 'lib'.


An array ref of files to be examined (will be added to dirs, above).


Will report duplicates even if variables are renamed.


Will report duplicates even if subroutines are renamed.


Minimum number of lines to compare between files. Default is 5.
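One way to picture this setting is as a minimum run length: two files only count as sharing a duplicate if at least that many consecutive lines match. The sketch below illustrates the idea on plain arrays of lines. The helper common_runs is hypothetical and is not part of Code::CutNPaste, which compares deparsed code rather than raw text.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Code::CutNPaste: returns an array ref of
# [ start_in_left, start_in_right, length ] for every maximal run of identical
# consecutive lines that is at least $window lines long.
sub common_runs {
    my ( $left, $right, $window ) = @_;
    my @runs;
    for my $i ( 0 .. $#$left ) {
        for my $j ( 0 .. $#$right ) {

            # Skip positions that merely continue a run that started earlier.
            next
                if $i > 0
                && $j > 0
                && $left->[ $i - 1 ] eq $right->[ $j - 1 ];

            # Count how many consecutive lines match from this position.
            my $len = 0;
            $len++
                while $i + $len <= $#$left
                && $j + $len <= $#$right
                && $left->[ $i + $len ] eq $right->[ $j + $len ];

            push @runs, [ $i, $j, $len ] if $len >= $window;
        }
    }
    return \@runs;
}

my @left  = ( 'my $x = 1;', 'foo();', 'bar();', 'baz();', 'return $x;' );
my @right = ( 'my $y = 2;', 'foo();', 'bar();', 'baz();', 'return $y;' );

my $runs_3 = common_runs( \@left, \@right, 3 );
my $runs_5 = common_runs( \@left, \@right, 5 );
printf "window 3: %d run(s); window 5: %d run(s)\n",
    scalar @$runs_3, scalar @$runs_5;
```

With a minimum of 3 lines the three shared calls are found; with the default of 5 they are too short to count.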


This code can be very slow. If verbose is true, will print a progress bar to STDERR. The progress bar has an ETA, but this number seems to be fairly unreliable. Maybe I'll remove it.


Takes an integer. Defaults to 1. This is the number of jobs we'll try to run to gather this data. On multi-core machines, you can easily use this to max out your CPU and speed up duplicate code detection.


A number between 0 and 1, representing a percentage. A duplicate section of code is reported only if the percentage of its lines containing "word" characters exceeds the threshold. This prevents spurious reporting of chunks of code like this:

         };          |         };
     }               |     }
     return \@data;  |     return \@attrs;
 }                   | }
 sub _confirm {      | sub _execute {

The above code has only 40% of its lines containing word (qr/\w/) characters, and thus will not be reported.
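The 40% figure can be checked by hand with a few lines of Perl. This is only a sketch of the calculation described above: the helper word_line_ratio and the 0.75 threshold value are illustrative assumptions, not the module's internals or its default.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of lines containing at least one \w character.
sub word_line_ratio {
    my (@lines) = @_;
    return 0 unless @lines;
    my $with_words = grep { /\w/ } @lines;    # grep in scalar context counts matches
    return $with_words / @lines;
}

# The left-hand column of the example above.
my @chunk = (
    '        };',
    '    }',
    '    return \@data;',
    '}',
    'sub _confirm {',
);

my $ratio     = word_line_ratio(@chunk);
my $threshold = 0.75;    # illustrative value only, not necessarily the default

printf "%.0f%% of lines contain word characters: %s\n",
    $ratio * 100, $ratio > $threshold ? 'report' : 'skip';
```

Only the "return" and "sub" lines contain word characters, so two of the five lines qualify and the chunk falls below the illustrative threshold.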


Boolean. Default false.

Due to a bug in Perl, the following code crashes Perl in Windows:

 perl -e "use open qw{:encoding(UTF-8) :std}; fork; "
 perl -e "open $f, '>:encoding(UTF-8)', 'temp.txt'; fork"
 perl -e "use utf8::all; fork"

By setting noutf8 to a true value, we avoid loading utf8::all. This may cause undesirable results.



By default, we cache "deparsed" versions of the code in $ENV{HOME}/.cutnpaste. You can use this attribute to specify a different cache directory.


A boolean. If true, will display some internal warnings when trying to deparse files. It's used for debugging, but you may find it useful. Largely gets triggered when you try to search for duplicates in a file that you already have in memory, or when the file in question cannot otherwise be deparsed.


Takes an arrayref of regular expressions. Blocks of code matching any of the regular expressions will not be reported as duplicates.
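As a rough illustration of the filtering described above, the sketch below drops any block that matches at least one pattern in a list of compiled regular expressions. The helper is_ignored and the sample patterns are hypothetical; how the module applies the patterns internally is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the block matches any of the given patterns.
sub is_ignored {
    my ( $block, $patterns ) = @_;
    for my $re (@$patterns) {
        return 1 if $block =~ $re;
    }
    return 0;
}

# Sample patterns, chosen for illustration only.
my @ignore = (
    qr/^\s*use\s+(?:strict|warnings)\b/m,
    qr/__PACKAGE__->meta->make_immutable/,
);

my @blocks = (
    join( "\n", 'use strict;', 'use warnings;', 'use Carp;' ),
    join( "\n", 'my $total = 0;', '$total += $_ for @items;', 'return $total;' ),
);

# Keep only the blocks that no pattern matches.
my @reportable = grep { !is_ignored( $_, \@ignore ) } @blocks;

printf "%d of %d blocks would still be reported\n",
    scalar @reportable, scalar @blocks;
```

Patterns built with qr// can carry their own modifiers, such as /m above, which lets a pattern anchor on any line of a multi-line block.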


  • Add Levenshtein edit distance

  • Mask off strings

    It's amazing how many strings I'm finding which hide duplicates.

  • Check files against themselves

    Currently, we only check for duplicates in other files. Whoops!

  • We need a way to skip modules

    This is very important for code bases with auto-generated modules. They don't care as much about duplicated code.

  • A config file?


Curtis "Ovid" Poe, <ovid at>


Please report any bugs or feature requests to bug-code-cutnpaste at, or through the web interface at I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.


You can find documentation for this module with the perldoc command.

    perldoc Code::CutNPaste




Copyright 2012 Curtis "Ovid" Poe.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See for more information.