NAME

Bio::Grep::Backends::Vmatch - Vmatch back-end

SYNOPSIS

  use Bio::Grep::Backends::Vmatch;
  
  use Bio::Root::Exception;
  use Error qw(:try);
  
  # configure our search back-end, in this case Vmatch
  my $sbe = Bio::Grep::Backends::Vmatch->new();
  
  $sbe->settings->execpath('/usr/local/vmatch');
  $sbe->settings->tmppath('/tmp');
  $sbe->settings->datapath('data');
  
  # Generate a Vmatch suffix array. You only have to do this once.
  $sbe->generate_database_out_of_fastafile('ATH1.cdna', 'AGI Transcripts (- introns, + UTRs)');
  
  my %local_dbs_description = $sbe->get_databases();
  my @local_dbs = sort keys %local_dbs_description;
  
  # take first available database in our test
  $sbe->settings->database($local_dbs[0]);
  
  # search for the reverse complement and allow 4 mismatches
  $sbe->settings->query('UGAACAGAAAGCUCAUGAGCC');
  $sbe->settings->reverse_complement(1);
  $sbe->settings->mismatches(4);
  
  # With many mismatches and short queries, the "online" algorithm
  # may be faster. This algorithm does not use the index. Test this!
  # $sbe->settings->online(1);

  # If you don't need the upstream/downstream options, set showdesc
  # (see the Vmatch manual) for performance reasons.
  $sbe->settings->showdesc(100);
  
  $sbe->search();

  # output the search results with nice alignments
  while ( my $res = $sbe->next_res ) {
     print $res->sequence->id . "\n";
     print $res->mark_subject_uppercase() . "\n";
     print $res->alignment_string() . "\n\n";
  }
  

DESCRIPTION

Bio::Grep::Backends::Vmatch searches for a query in a Vmatch suffix array.

METHODS

See Bio::Grep::Backends::BackendI for other methods.

Bio::Grep::Backends::Vmatch->new()

This function constructs a Vmatch back-end object.

   my $sbe = Bio::Grep::Backends::Vmatch->new();

$sbe->available_sort_modes()

Returns all available sort modes as a hash. Keys are sort modes, values are short descriptions.

   $sbe->sort('ga');

Available sort modes in Vmatch:

                ga  : 'ascending order of dG'
                gd  : 'descending order of dG'
                la  : 'ascending order of length'
                ld  : 'descending order of length'
                ia  : 'ascending order of first position'
                id  : 'descending order of first position'
                ja  : 'ascending order of second position'
                jd  : 'descending order of second position'
                ea  : 'ascending order of Evalue'
                ed  : 'descending order of Evalue'
                sa  : 'ascending order of score'
                sd  : 'descending order of score'
                ida : 'ascending order of identity'
                idd : 'descending order of identity'

Note that 'ga' and 'gd' require that search results have dG set. Bio::Grep::RNA ships with filters for free energy calculation. Also note that these two sort options require loading all results into memory.
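
The dG constraint above can be checked before a search is started. The following is only a sketch: the hash is copied from the table above as a stand-in for the return value of available_sort_modes(), and needs_dg() and check_sort_mode() are hypothetical helpers that are not part of Bio::Grep.

```perl
use strict;
use warnings;

# Stand-in for the hash returned by available_sort_modes(); the entries
# are copied from the table above so the sketch runs without Vmatch.
my %sort_modes = (
    ga  => 'ascending order of dG',
    gd  => 'descending order of dG',
    la  => 'ascending order of length',
    ld  => 'descending order of length',
    ia  => 'ascending order of first position',
    id  => 'descending order of first position',
    ja  => 'ascending order of second position',
    jd  => 'descending order of second position',
    ea  => 'ascending order of Evalue',
    ed  => 'descending order of Evalue',
    sa  => 'ascending order of score',
    sd  => 'descending order of score',
    ida => 'ascending order of identity',
    idd => 'descending order of identity',
);

# Hypothetical helper: true for the two modes that need dG on every
# result (and therefore all results in memory).
sub needs_dg {
    my ($mode) = @_;
    return $mode =~ /\Ag[ad]\z/ ? 1 : 0;
}

# Hypothetical helper: dies on an unknown mode or on a dG mode when the
# caller says the results carry no dG; otherwise returns the description.
sub check_sort_mode {
    my ( $mode, $results_have_dg ) = @_;
    die "unknown sort mode '$mode'\n" unless exists $sort_modes{$mode};
    die "sort mode '$mode' requires dG on all results\n"
        if needs_dg($mode) && !$results_have_dg;
    return $sort_modes{$mode};
}

print check_sort_mode( 'la', 0 ), "\n";
```

Running such a check first turns a late sorting failure into an early, readable error.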

$sbe->get_sequences()

Takes an array reference as its argument. If the first array element is an integer, this method assumes that the specified sequence ids are Vmatch internal ids. Otherwise it takes the first array element as the query.

    # get sequences 0,2 and 4 out of suffix array
    $sbe->get_sequences([0,2,4]);

    # get sequences that start with At1g1
    $sbe->get_sequences(['At1g1', 'ignored']);
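
The dispatch rule described above can be illustrated in plain Perl. This is a sketch of the documented behaviour only, not the module's actual implementation: classify_get_sequences_arg() is a hypothetical helper, and treating "integer" as "non-negative decimal digits" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Grep): classifies the argument
# the way get_sequences() is documented to interpret it.
sub classify_get_sequences_arg {
    my ($ids_ref) = @_;
    die "expected a non-empty array reference\n"
        unless ref($ids_ref) eq 'ARRAY' && @{$ids_ref};

    my $first = $ids_ref->[0];

    # First element looks like an integer: all elements are treated as
    # Vmatch internal sequence ids.
    if ( $first =~ /\A\d+\z/ ) {
        return { mode => 'internal_ids', ids => [ @{$ids_ref} ] };
    }

    # Otherwise only the first element is used, as the query; any
    # further elements are ignored.
    return { mode => 'query', query => $first };
}

my $by_id    = classify_get_sequences_arg( [ 0, 2, 4 ] );
my $by_query = classify_get_sequences_arg( [ 'At1g1', 'ignored' ] );

print "$by_id->{mode}\n";
print "$by_query->{mode}: $by_query->{query}\n";
```

Because only the first element decides the mode, mixing ids and query strings in one call is best avoided.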

DIAGNOSTICS

See Bio::Grep::Backends::BackendI for other diagnostics.

Bio::Root::SystemException

Vmatch error: Query not valid ...

It was not possible to run Vmatch in function search. Check the search settings.

Vmatch error: Cannot generate suffix array

It was not possible to generate a suffix array in function generate_database_out_of_fastafile. Check permissions and paths.

Vmatch error: Cannot fetch sequence out of suffix array

It was not possible to get some sequences out of the suffix array in function get_sequences. Check the sequence ids.

Bio::Root::BadParameter

You can't use showdesc() with upstream or downstream.

We need the tool vsubseqselect of the Vmatch package for the upstream and downstream regions. That tool requires an internal Vmatch sequence id as a parameter, and this id is not shown in the Vmatch output when showdesc is on.

You have to specify complete or querylength. ...

The Vmatch parameters -complete and -l cannot be combined. See the Vmatch documentation.

Vmatch: You can't combine qspeedup and complete

The Vmatch parameters -complete and -qspeedup cannot be combined. See the Vmatch documentation.

unsupported alphabet of file

The method generate_database_out_of_fastafile() could not determine the alphabet (DNA or Protein) of the specified Fasta file.
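
The three parameter conflicts listed under Bio::Root::BadParameter can be checked up front. The sketch below uses a plain hash as a stand-in for the settings object. The key names are assumed to match the setting names in the messages above, and find_setting_conflicts() is a hypothetical helper, not part of Bio::Grep.

```perl
use strict;
use warnings;

# Hypothetical pre-flight check. %s is a plain hash standing in for the
# search settings; the key names are assumptions taken from the
# diagnostics above.
sub find_setting_conflicts {
    my (%s) = @_;
    my @problems;

    # vsubseqselect needs the internal sequence id, which Vmatch does
    # not print when showdesc is on.
    push @problems, 'showdesc cannot be used with upstream or downstream'
        if defined $s{showdesc} && ( $s{upstream} || $s{downstream} );

    # -complete and -l are mutually exclusive.
    push @problems, 'complete and querylength cannot be combined'
        if $s{complete} && defined $s{querylength};

    # -complete and -qspeedup are mutually exclusive.
    push @problems, 'qspeedup and complete cannot be combined'
        if $s{complete} && defined $s{qspeedup};

    return @problems;
}

my @problems = find_setting_conflicts(
    showdesc => 100,
    upstream => 30,
    complete => 1,
);
print "$_\n" for @problems;
```

Returning a list of problems instead of dying on the first one lets a caller report every conflict at once.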

It is too slow. If I call Vmatch on the command line, it is much faster!

Did you set showdesc(100)? Yes? Write a bug report!

SEE ALSO

Bio::Grep::Backends::BackendI, Bio::Grep::Container::SearchSettings, Bio::SeqIO

AUTHOR

Markus Riester, <mriester@gmx.de>

LICENCE AND COPYRIGHT

Based on Weigel::Search v0.13

Copyright (C) 2005-2006 by Max Planck Institute for Developmental Biology, Tuebingen.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.