Stephen Gordon Lenk | 1 Sep 01:34 2005

Bioperl adds utility to msaexcise script


I adopted Heikki Lehvaslaiho's ideas. The script now reads/writes multiple formats based on users
request on 
command line. Thanks Bioperl developers! Snippet below shows use of AlignIO. I'll work on better/more
flexible 
data mining for CD regions next. I'd like to be able to use multiple types of protein id's as are in user's 
alignment and get CD for it.

eval {

	##############
	# input stream
	##############
	
	use Bio::AlignIO;
	my $in = Bio::AlignIO->new( -fh        => \*STDIN,
	                                             -format => $informat ) -> next_aln();
	
	########## 
	# excision
	##########
	
	my $out = _excise( $max_e_value, 
					   $use_all_cds,
					   $full_excise,
					   \ <at> excise_cd,
					   $in );
	
	
	###############
(Continue reading)

Barry Moore | 1 Sep 19:33 2005
Picon

RE: Bio::Search results

Sean-

The top of you input hmmpfam file has these four lines:

hmmpfam - search one or more sequences against HMM database HMMER 2.3.2
(Oct 2003) Copyright (C) 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License (GPL)

When I copied that out of you e-mail and into emacs on my computer I got
it all as one line like this:

hmmpfam - search one or more sequences against HMM database HMMER 2.3.2
(Oct 2003) Copyright (C) 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License (GPL)

The parser needs to see a line beginning with HMMER to trigger setting
an internal variable called _reporttype (among other things).  Since
this never happened with a single line header I got the same results you
did.  Your header lines look OK in the e-mail you sent me, but assuming
you have the same problem with you header lines that I did, fixing that
should lead you to your next problem...

This line of your code has an error:

next unless ($hsp->name =~ /^ig|^lrr|^fn3|^egf|^tsp|^psi/i);

HSP objects don't have names but their associated hit objects do, so
this makes your script work as intended:

next unless ($hit->name =~ /^ig|^lrr|^fn3|^egf|^tsp|^psi/i);
(Continue reading)

Paul G Cantalupo | 1 Sep 20:58 2005
Picon

RemoteBlast save_output does not handle -m8 output correctly

Hello,

I'm using bioperl-1.4 and I noticed that RemoteBlast save_output method 
creates an empty file (zero bytes) when dealing with blast -m8 output 
(i.e. ALIGNMENT_VIEW = Tabular). Is there a fix for this?

Thank you,

Paul

Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759

Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer
Jason Stajich | 1 Sep 21:21 2005

Re: RemoteBlast save_output does not handle -m8 output correctly

You'll need to fix this logic which expects to see a BLAST HEADER  
before it starts writing.  If someone post a patch we can apply the fix.

    while(my $l = <TMP>) {
                 next if ($l =~ /<pre>/);
                 if( $l =~ /^(?:[T]?BLAST[NPX])\s*.+$/i ||
                          $l =~/^RPS-BLAST\s*.+$/i ) {
                         $seentop=1;
                 }
                 next if !$seentop;
                 if( $seentop ) {
                         print SAVEOUT $l;
                 }
         }

On Sep 1, 2005, at 2:58 PM, Paul G Cantalupo wrote:

> Hello,
>
> I'm using bioperl-1.4 and I noticed that RemoteBlast save_output  
> method creates an empty file (zero bytes) when dealing with blast - 
> m8 output (i.e. ALIGNMENT_VIEW = Tabular). Is there a fix for this?
>
> Thank you,
>
> Paul
>
> Paul Cantalupo
> Research Specialist/Systems Programmer
> 559 Crawford Hall
(Continue reading)

Paul G Cantalupo | 1 Sep 21:52 2005
Picon

RemoteBlast retrieve_blast cannot return Bio::SearchIO::blasttable

Hello,

When using RemoteBlast with ALIGNMENT_VIEW = 'Tabular' (-m8 output) the 
method 'retrieve_blast' should return a Bio::SearchIO::blasttable object 
to handle the parsing of -m8 output. I think the easiest way to do this is 
by changing 'retrieve_blast'

FROM:

if( $self->readmethod =~ /BPlite/ ) {
    $blastobj = new Bio::Tools::BPlite(-file => $tempfile);
} else {
    $blastobj = new Bio::SearchIO( -file => $tempfile,
                                   -format => 'blast');
}

TO:

if( $self->readmethod =~ /BPlite/ ) {
    $blastobj = new Bio::Tools::BPlite(-file => $tempfile);
} elsif( $self->readmethod =~ /blasttable/ ) {
    $blastobj = new Bio::SearchIO( -file => $tempfile,
                                   -format => 'blasttable');
} else {
    $blastobj = new Bio::SearchIO( -file => $tempfile,
                                   -format => 'blast');
}

Any suggestions/criticisms?

(Continue reading)

Barry Moore | 2 Sep 04:23 2005
Picon

RE: Bio::Search results

Sean I see that my mailer messed up the line formatting also.  What I
meant was you should have the top four lines like this:

#Line 1: hmmpfam - search one or more sequences against HMM database 
#Line 2: HMMER 2.3.2(Oct 2003) 
#Line 3: Copyright (C) 1992-2003 HHMI/Washington University School of
#Line 4: Medicine Freely distributed under the GNU General Public
License (GPL)

If you have that, and fix the hsp to hit error mentioned it works for
me.

Barry

-----Original Message-----
From: bioperl-l-bounces <at> portal.open-bio.org
[mailto:bioperl-l-bounces <at> portal.open-bio.org] On Behalf Of Barry Moore
Sent: Thursday, September 01, 2005 11:34 AM
To: Bioperl List
Subject: RE: [Bioperl-l] Bio::Search results

Sean-

The top of you input hmmpfam file has these four lines:

hmmpfam - search one or more sequences against HMM database HMMER 2.3.2
(Oct 2003) Copyright (C) 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License (GPL)

When I copied that out of you e-mail and into emacs on my computer I got
(Continue reading)

Andrew Leung | 2 Sep 06:02 2005
Picon

How to assess sequencing quality with Bioperl?

Hello,
How to read trace files and report sequencing quality using Bioperl? Is
there anyone who can give me some hints on what Bioperl modules I can use
and what I could obtain from using them? I've googled the Bioperl site for
this topic, but felt fluctuated.
Thank you in advance.
Andrew
Heikki Lehvaslaiho | 2 Sep 10:48 2005
Picon
Picon

Forthcoming EMBL databank changes

FYI:

There are number of planned changes that will affect the EMBL nucleotide 
databank:

http://www.ebi.ac.uk/embl/Documentation/forthcomingchanges.html

Most of them are minor but will be effective this year, but the one planned 
for June 2006 removes the ID completely:

http://www.ebi.ac.uk/embl/Documentation/changesdetails.html

Peter Rice has started  a discussion on its implications on the emboss mailing 
list: 
http://newportal.open-bio.org/pipermail/emboss/2005-September/002203.html

Perhaps we should pay attention! ;-)

 -Heikki

--

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_ebi _ac _uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambridge, CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________
(Continue reading)

Sean O'Keeffe | 2 Sep 11:31 2005
Picon

Re: Bio::Search results

I have those four lines (all contained on separate lines) but it didn't work . 

I ran "gawk '{if(/^\/\/$/) {print "//\n\nHMMER 2.3.2 (Oct 2003)"} else
{print}}' my_file > whatever" on the pfam file to include a HMMER
header after the // separator for each result.

It now works fine. I still don't understand why I need to include this
header to parse the file. Although from memory don't Blast reports all
include Blast header lines to separate next results?
By the way I think that its the mailer that seems to be altering the
text in these scripts (don't know where the "=3D" came from - it's not
in my original). I'm running a standard linux (suse 9.0) using nedit
as an editor.
Sean.

On 9/2/05, Barry Moore <bmoore <at> genetics.utah.edu> wrote:
> Sean I see that my mailer messed up the line formatting also.  What I
> meant was you should have the top four lines like this:
> 
> #Line 1: hmmpfam - search one or more sequences against HMM database
> #Line 2: HMMER 2.3.2(Oct 2003)
> #Line 3: Copyright (C) 1992-2003 HHMI/Washington University School of
> #Line 4: Medicine Freely distributed under the GNU General Public
> License (GPL)
> 
> If you have that, and fix the hsp to hit error mentioned it works for
> me.
> 
> Barry
> 
(Continue reading)

Nathan Haigh | 2 Sep 11:01 2005
Picon

RE: How to assess sequencing quality with Bioperl?

I'm sure other people will have better, more indepth help, but to get you
started, I can point you in the right direction - but I haven't had to deal
with trace files myself - yet!

You will need to use Bio::SeqIO. The following HowTO shows the supported
formats inc. trace file formats: http://bioperl.org/HOWTOs/html/SeqIO.html. 

Have a read through the Bio::SeqIO.pm file, there is mention of needing to
install the Staden "io_lib" package, as well as the Bio::SeqIO::staden::read
package available from the bioperl-ext repository.

Nathan

-----Original Message-----
From: bioperl-l-bounces <at> portal.open-bio.org
[mailto:bioperl-l-bounces <at> portal.open-bio.org] On Behalf Of Andrew Leung
Sent: 02 September 2005 05:02
To: bioperl-l <at> portal.open-bio.org
Subject: [Bioperl-l] How to assess sequencing quality with Bioperl?

Hello,
How to read trace files and report sequencing quality using Bioperl? Is
there anyone who can give me some hints on what Bioperl modules I can use
and what I could obtain from using them? I've googled the Bioperl site for
this topic, but felt fluctuated.
Thank you in advance.
Andrew

_______________________________________________
Bioperl-l mailing list
(Continue reading)


Gmane