Author: Marvin Humphrey (and 1 contributor)

NAME

extract_reuters.plx - parse Reuters 21578 corpus into individual files

SYNOPSIS

    ./extract_reuters.plx /path/to/expanded/reuters/archive

DESCRIPTION

This script extracts the TITLE and BODY of each item in the Reuters 21578 corpus and writes them to individual files. It expects the location of the decompressed archive as a command-line argument.
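To make the extraction step concrete, here is a minimal sketch of how TITLE and BODY might be pulled out of the corpus SGML. It is not taken from extract_reuters.plx: the helper name `extract_items`, the regex-based approach, and the decision to skip items lacking either element are all assumptions made for illustration. The inline sample stands in for one of the archive's .sgm files.

```perl
use strict;
use warnings;

# Hypothetical helper (not from extract_reuters.plx): return a list of
# { title => ..., body => ... } hashes, one per <REUTERS> item that
# contains both a TITLE and a BODY element.
sub extract_items {
    my ($sgml) = @_;
    my @items;

    # Walk the text one <REUTERS>...</REUTERS> block at a time.
    while ( $sgml =~ m{(<REUTERS\b.*?</REUTERS>)}gs ) {
        my $item = $1;

        # Assumption: items missing either element are skipped.
        my ($title) = $item =~ m{<TITLE>(.*?)</TITLE>}s or next;
        my ($body)  = $item =~ m{<BODY>(.*?)</BODY>}s   or next;

        push @items, { title => $title, body => $body };
    }
    return @items;
}

# Tiny inline sample; the second item has no BODY and is skipped.
my $sample = <<'END_SGML';
<REUTERS NEWID="1">
<TEXT><TITLE>FIRST TITLE</TITLE>
<BODY>First body text.</BODY></TEXT>
</REUTERS>
<REUTERS NEWID="2">
<TEXT><TITLE>NO BODY HERE</TITLE></TEXT>
</REUTERS>
END_SGML

for my $item ( extract_items($sample) ) {
    print "$item->{title}\n$item->{body}\n";
}
```

In a fuller version, the loop at the bottom would open one output file per item and print the title and body into it instead of writing to standard output. The sketch also leaves SGML character entities in the text undecoded.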