NAME

Bio::ToolBox::utility - common utility functions for Bio::ToolBox

DESCRIPTION

These are general subroutines that don't fit in with the other modules.

REGULAR SUBROUTINES

The following subroutines are automatically exported when you use this module.

parse_list
        my $index_request = '1,2,5-7';
        my @indices = parse_list($index_request); # returns (1,2,5,6,7)

This subroutine parses a scalar value into a list of values. The scalar is a text string of numbers (usually column or dataset indices) delimited by commas and/or including a range. For example, the string "1,2,5-7" becomes the list (1,2,5,6,7).

Pass the subroutine the scalar string.

It will return the list of numbers.
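
The parsing described above can be sketched in a few lines. This is only an illustration of the idea, not the module's actual code: the name sketch_parse_list is hypothetical, and dying on an unrecognized item is an assumption made for the sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parse_list; not the module's implementation.
sub sketch_parse_list {
    my ($string) = @_;
    my @values;
    foreach my $item ( split /,/, $string ) {
        $item =~ s/\s+//g;    # tolerate stray whitespace around items
        if ( $item =~ /^(\d+)-(\d+)$/ ) {
            # expand a range such as 5-7 into 5, 6, 7
            push @values, ( $1 + 0 ) .. ( $2 + 0 );
        }
        elsif ( $item =~ /^\d+$/ ) {
            push @values, $item + 0;
        }
        else {
            # assumption: reject anything that is not a number or a range
            die "unrecognized item '$item' in list '$string'\n";
        }
    }
    return @values;
}

my @indices = sketch_parse_list('1,2,5-7');
print join( ',', @indices ), "\n";    # prints 1,2,5,6,7
```

Because a list is returned, calling the sketch in scalar context gives the number of elements rather than the values themselves.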

format_with_commas
        my $count = '4327908475';
        printf " The final count was %s\n", format_with_commas($count);

This subroutine formats a large number (e.g. 4327908475) into a human-friendly version with commas delimiting the thousands (4,327,908,475).

Pass the subroutine a scalar string with a number value.

It will return a scalar value containing the formatted number.
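
For readers curious how such formatting can be done, the following minimal sketch uses the well-known substitution loop from the Perl FAQ. The name sketch_format_with_commas is hypothetical, and the module may well implement this differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for format_with_commas; not the module's implementation.
sub sketch_format_with_commas {
    my ($number) = @_;
    my $text = "$number";
    # Repeatedly split the last three digits off the leading integer part.
    # Anchoring at the start leaves any decimal fraction untouched.
    1 while $text =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $text;
}

my $count = '4327908475';
printf " The final count was %s\n", sketch_format_with_commas($count);
# prints " The final count was 4,327,908,475"
```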

ask_user_for_index
        my @answers = ask_user_for_index($Data, 'Please enter 2 or more columns   ');

This subroutine presents the list of column names from a Bio::ToolBox::Data structure, along with their numeric indexes, and prompts the user to select and enter one or more of them. The list is only printed once (if it hasn't changed), so as not to annoy the user with repeated lists of header names when the function is used more than once. A text prompt should be provided; otherwise a generic one is used. The list of indices is validated, and a warning is printed for invalid responses. The responses are then returned as a single value or an array, depending on context.

simplify_dataset_name
        my $simple_name = simplify_dataset_name($dataset);

This subroutine takes a dataset name and simplifies it. Dataset names are often the file names of data files, such as Bam and bigWig files. These may include a file:, http:, or ftp: prefix, one or more directory paths, and one or more file name extensions. Additionally, more than one dataset may be combined with an ampersand, for example two stranded bigWig files. This function will safely remove the prefix, directories, and everything after the first period.
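
The clean-up steps listed above can be approximated as follows. This is a rough sketch under stated assumptions, not the module's code: the name sketch_simplify_name is hypothetical, the example paths are made up, and rejoining combined datasets with an ampersand is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for simplify_dataset_name; not the module's implementation.
sub sketch_simplify_name {
    my ($dataset) = @_;
    my @simple;
    foreach my $name ( split /&/, $dataset ) {
        $name =~ s{^(?:file|http|ftp):/*}{};    # remove the prefix
        $name =~ s{^.*/}{};                     # remove directories
        $name =~ s/\..*$//;                     # remove from the first period on
        push @simple, $name;
    }
    # assumption: combined datasets are rejoined with an ampersand
    return join '&', @simple;
}

print sketch_simplify_name('http://example.com/tracks/chipseq_rep1.sorted.bam'), "\n";
# prints chipseq_rep1
print sketch_simplify_name('file:/data/wt_f.bw&file:/data/wt_r.bw'), "\n";
# prints wt_f&wt_r
```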

LEGACY SUBROUTINES

These are additional functions that can be optionally exported. They provide access to the Bio::ToolBox::Data::file functions that might be needed by old scripts that do not use Bio::ToolBox::Data objects. You normally should not need these. If you import any of them, be sure to also explicitly import any of the regular subroutines above that you need.

open_to_read_fh

Wrapper around "open_to_read_fh" in Bio::ToolBox::Data::file. Opens a file as a read-only IO::Handle object. Transparently handles gzip and bzip2 compression.

open_to_write_fh
        my $fh = open_to_write_fh($file, $gz, $append);

Wrapper around "open_to_write_fh" in Bio::ToolBox::Data::file. Opens a file as a write-only IO::Handle object. Pass the file name as the first argument. Optionally pass a boolean value as the second argument if you want the file to be written as a compressed gzip file. Pass another boolean value as the third argument if you want to append to an existing file; otherwise an existing file with the same name will be overwritten!

check_file

Wrapper around "check_file" in Bio::ToolBox::Data::file. Checks to see if a file exists. If not, some common missing extensions are appended and existence is re-checked. If a file is found, its name is returned so that it can be opened. Useful, for example, if you forget the .txt or .gz extensions.
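
The retry-with-extensions behavior can be pictured with the sketch below. It is not the module's code: the name sketch_check_file is hypothetical, and the extension list is an assumption built only from the .txt and .gz examples mentioned above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Assumed extension list; the real function may try others.
my @ASSUMED_EXTENSIONS = qw(.gz .txt .txt.gz);

# Hypothetical stand-in for check_file; not the module's implementation.
sub sketch_check_file {
    my ($file) = @_;
    return $file if -e $file;    # the name works as given
    foreach my $ext (@ASSUMED_EXTENSIONS) {
        my $candidate = $file . $ext;
        return $candidate if -e $candidate;
    }
    return;    # nothing found: undef in scalar context
}

# Demonstration with a throwaway file named "table.txt".
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'table.txt' );
open( my $fh, '>', $path ) or die "cannot write $path: $!";
close $fh;

# Ask for "table" without its extension; the sketch finds "table.txt".
my $found = sketch_check_file( File::Spec->catfile( $dir, 'table' ) );
print defined $found ? "found $found\n" : "not found\n";
```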

AUTHOR

 Timothy J. Parnell, PhD
 Dept of Oncological Sciences
 Huntsman Cancer Institute
 University of Utah
 Salt Lake City, UT, 84112

This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.