USAGE <infiles> --id-regex=<str> [optional arguments]



Path to input FASTA files [repeatable argument].


Regular expression for capturing the original seq id.

The argument value is either a predefined regex or a custom regex given on the command line (in the latter case, remember to escape the special chars). The following predefined regexes are available (each assumes a leading '>'):

    - :DEF (first stretch of non-whitespace chars)
    - :GI  (number nnn in  gi|nnn|...)
    - :GNL (string xxx in gnl|yyy|xxx)
    - :JGI (number nnn in jgi|xxx|nnn or jgi|xxx|nnn|yyy)
    - :PAC (number nnn in xxx|PACid:nnn)
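
To make the captures above concrete, here is a minimal Python sketch of what each predefined regex might look like. The patterns, the PREDEFINED table, and the capture_id helper are all hypothetical, written only from the descriptions in the list; the program's actual definitions may differ.

```python
import re

# Hypothetical approximations of the predefined regexes described above.
# They are NOT taken from the program itself; each assumes a leading '>'.
PREDEFINED = {
    ":DEF": r"^>(\S+)",                  # first stretch of non-whitespace chars
    ":GI":  r"^>gi\|(\d+)\|",            # number nnn in gi|nnn|...
    ":GNL": r"^>gnl\|[^|\s]+\|(\S+)",    # string xxx in gnl|yyy|xxx
    ":JGI": r"^>jgi\|[^|\s]+\|(\d+)",    # number nnn in jgi|xxx|nnn[|yyy]
    ":PAC": r"^>\S*\|PACid:(\d+)",       # number nnn in xxx|PACid:nnn
}


def capture_id(defline, regex):
    """Return the original seq id captured from a FASTA definition line.

    `regex` is either a predefined name (e.g. ':GI') or a custom pattern
    with one capturing group. Returns None when nothing matches.
    """
    pattern = PREDEFINED.get(regex, regex)
    match = re.search(pattern, defline)
    return match.group(1) if match else None
```

A custom regex works the same way as long as it has one capturing group, for example capture_id(">sp|P12345|NAME_HUMAN", r"^>sp\|(\w+)\|"). Note that the pipe char must be escaped in such a pattern, since an unescaped pipe means alternation.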



Optional output dir for the abbreviated FASTA files (created if needed) [default: none]. If not specified, output files are written to the same directory as the input files.


Path to an optional IDM file explicitly listing the infile => prefix pairs. Useful when processing multiple input files. This argument and the next one (--id-prefix) can be specified together. In that case, however, a single pipe char is appended to the combined prefix.


String to use as the seq id prefix (e.g., NCBI taxon id, 4-letter code) [default: none].


Store the IDM file corresponding to each output file [default: no].


Print the usual program information.