Scott Cain | 23 Apr 18:28 2014

Recent BP and GBrowse/JBrowse

Hi all,

I've noticed that there seems to be a problem with recent BioPerl releases
and GBrowse and JBrowse.  When the tests run for G/JBrowse, we get messages
like this:

I don't think the problem is coming from Bio::DB::SeqFeature::Store::memory
since that hasn't changed in 3 years.  Could Bio::DB::IndexedBase be the
problem?  I see this behavior in 1.6.923 and 1.6.922 but not in 1.6.901.


Scott Cain, Ph. D.                                   scott at scottcain dot
GMOD Coordinator                                     216-392-3087
Ontario Institute for Cancer Research

Rik Rademaker | 21 Apr 16:55 2014

Writing and retrieving Genbank files from BioSQL

Dear all,

I am a biologist trying to write GenBank files to BioSQL. I am comfortable 
writing Python scripts, but there is a problem with BioPython: the molecule 
type in the LOCUS line is lost (e.g. 'circular DNA' becomes just 'DNA'). I 
am now trying to figure out how BioPerl handles this and how BioPerl writes 
this information to BioSQL.

I have a BioSQL database (MySQL) and I can commit to BioSQL, e.g. via this script:

use strict;
use warnings;

use Bio::DB::BioDB;
use Bio::DB::GenBank;

# Fetch the GenBank record by its accession
my $genbank_id = 'L08752';

my $genDB    = Bio::DB::GenBank->new;
my $sequence = $genDB->get_Seq_by_id($genbank_id);

# Connect to the local BioSQL database
my $db = Bio::DB::BioDB->new(-database => 'BioSQL',
                             -user     => 'root',
                             -dbname   => 'bioseqdb',
                             -host     => 'localhost',
                             -driver   => 'mysql');

# Wrap the sequence in a persistent object bound to that database
my $pobj = $db->create_persistent($sequence);

Warren Gallin | 18 Apr 20:51 2014

Constructing split features

It occurs to me that if one could create a Bio::SeqFeature object using the information in a coded_by tag
from a CDS Bio::SeqFeature in a protein record, then it should be possible to add that new feature to a copy
of the underlying nucleotide sequence (just pull the sequence from GenBank in FASTA format, create a
temporary new GenBank-format Bio::Seq object, add the single new feature and use it to extract the spliced sequence).

Is this even possible?  Or would you have to strip out all the information on the various spliced locations
from the coded_by tag and rebuild a new Bio::SeqFeature from scratch?  Or is this even a reasonable
approach to creating an automated way of getting coding sequences?
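
As a starting point for the "strip out the locations" option, here is a minimal sketch in plain Perl. It is not BioPerl's own location parser; the sub name parse_coded_by and the accessions are made up for illustration. It assumes the coded_by value contains only pieces of the form accession:start..end, optionally wrapped in join() and complement(), with no fuzzy positions such as "<1..200" and no single-base positions.

```perl
use strict;
use warnings;

# Hypothetical helper: split a coded_by value into its segments.
# Returns a list of hashrefs with keys acc, start, end, strand,
# in the order in which the segments are written in the string.
sub parse_coded_by {
    my ($coded_by) = @_;
    my @segments;

    # complement(join(...)) puts every segment on the reverse strand
    my $outer_strand = $coded_by =~ /^\s*complement\(\s*join\(/ ? -1 : 1;

    while ($coded_by =~ /(complement\()?([A-Za-z_0-9]+(?:\.\d+)?):(\d+)\.\.(\d+)/g) {
        # a complement() directly around one segment applies to it alone
        my $strand = defined $1 ? -1 : $outer_strand;
        push @segments, { acc => $2, start => $3, end => $4, strand => $strand };
    }
    return @segments;
}

# Placeholder accessions, not real records
my @segs = parse_coded_by('join(XX000001.1:10..50,complement(XX000002.1:1..120))');
printf "%s %d..%d strand %d\n", @{$_}{qw(acc start end strand)} for @segs;
```

Each segment could then be used to build one sub-location of a new feature on the nucleotide record, or simply to list which accessions need to be fetched.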

Warren Gallin

Warren Gallin | 17 Apr 23:26 2014

Another problem with a joined feature

I have encountered another problem with several GenBank nucleotide sequence records.

The exemplar is accession number AF071478.1, which is a sequence of a specific human exon, annotated with a
CDS feature that describes the whole protein sequence, of which this exon encodes only a part.

The CDS feature is described as a join of pieces of sequence from this record and several other records. 
However, when I try to obtain the full CDS I am only getting the part of the sequence from the AF071478.1
record, not the others. 

The following is a minimal script that illustrates this behaviour.

My understanding was that if a feature was described as a join of multiple nucleotide sequence records then
the spliced_seq() method, given a handle to GenBank created using Bio::DB::GenBank, should also
collect and splice in the fragments of nucleotide sequence from other GenBank records to create the final
spliced sequence.  Could someone explain where my understanding is going awry?

Once again, thanks for any help.

Warren Gallin



use strict;
use warnings;
use DBI;
use Bio::Seq;
use Bio::DB::EUtilities;
use Bio::SeqIO;

Pau Marc Muñoz Torres | 16 Apr 15:43 2014

Running a blast with bioperl and blosum 45

Good afternoon

I wrote a script to run a number of BLAST searches automatically. I found that
it is possible to change the table used by


I managed to do it, but I can't use either the BLOSUM45 or the PAM250 table.
What can I do to use those tables?


Pau Marc Muñoz Torres
skype: pau_marc

Warren Gallin | 16 Apr 07:37 2014

Possible Repeat E-Mail


	My previous message bounced, presumably because I included an attachment.

	On the chance that it did not make it through, here is the relevant test case:

The script is as follows:



use strict;
use warnings;
use DBI;
use Bio::Seq;
use Bio::DB::EUtilities;
use Bio::SeqIO;
use Data::Printer;
use Bio::DB::GenBank;

my $gi = 302393575;  #This gi number is for the protein record of a horse ion channel
my $spliced_cds;
my $na_seq;
my %na_vkcnt_id;

#Create a database handle to GenBank for retrieving coding sequences

my $gb_db = Bio::DB::GenBank->new();

Warren Gallin | 15 Apr 20:39 2014

Getting coding sequence starting with a protein record

I am having a problem finding a general method of recovering the nucleotide coding sequence for a protein
sequence record.

Generally tracking the CDS annotation back to the nucleotide sequence record using the accession number
of the nucleotide sequence is working.

One problem arises when the underlying coding sequence is spliced from multiple nucleotide records.  Is
there a general approach to automatically track down and join the different sequence fragments from
different sequence entries?  An example of the problem can be seen if you start from the protein record with
GI number 7715882.  It is annotated as coming from three different nucleotide records.  Is there an
approach in BioPerl that will detect and download these three records and splice together the
appropriate parts to get the coding sequence?
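
To make the bookkeeping concrete, the following is a rough sketch in plain Perl with no BioPerl calls. It assumes the CDS location has already been broken into segments (accession, start, end, strand) listed in coding order, and that the nucleotide records named there have already been downloaded into a hash keyed by accession. The accessions, the toy sequences and the sub name splice_segments are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical inputs: toy sequences standing in for fetched records
my %seq_for = (
    'XX000001.1' => 'ccATGAAAgg',
    'XX000002.1' => 'ttCTAGGGaa',
);
my @segments = (
    { acc => 'XX000001.1', start => 3, end => 8, strand =>  1 },
    { acc => 'XX000002.1', start => 3, end => 8, strand => -1 },
);

# Concatenate the pieces in the order given, reverse-complementing
# any segment that lies on the minus strand.
sub splice_segments {
    my ($segments, $seq_for) = @_;
    my $cds = '';
    for my $seg (@$segments) {
        my $seq = $seq_for->{ $seg->{acc} };
        die "no sequence fetched for $seg->{acc}\n" unless defined $seq;

        # GenBank coordinates are 1-based and inclusive
        my $piece = substr($seq, $seg->{start} - 1, $seg->{end} - $seg->{start} + 1);
        if ($seg->{strand} < 0) {
            $piece = scalar reverse $piece;
            $piece =~ tr/ACGTacgt/TGCAtgca/;
        }
        $cds .= $piece;
    }
    return $cds;
}

# The distinct accessions that would have to be downloaded
my %needed = map { $_->{acc} => 1 } @segments;
print join(', ', sort keys %needed), "\n";

print splice_segments(\@segments, \%seq_for), "\n";   # prints ATGAAACCCTAG
```

In a real script the hash would be filled by fetching each needed accession once, so that a record contributing several exons is only downloaded a single time.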

The other problem that I am having is the ongoing issue of protein records annotated as highly redundant
sequences, with WP_XXXXXX accession numbers.  Has anyone found a way to retrieve the set of different
nucleotide sequences that all encode a single WP-annotated protein sequence?

Any help would be appreciated,

Warren Gallin

Yoshiro Nagao | 14 Apr 08:06 2014

about Open-Source Genomic Analysis

Dear all,

I came across the following article

N Engl J Med. 2011 Aug 25;365(8):718-24.
Open-source genomic analysis of Shiga-toxin-producing E. coli O104:H4.

in which the authors requested bioinformatic analyses worldwide.

The details are available from:

I am interested in this way of outsourcing bioinformatic analysis, which
may find the bioinformatician(s) most appropriate for a specific problem.

How could such a request for outsourced bioinformatic analysis be made,
and to what extent? 
Are there any ethical considerations? 

Any information would be appreciated,

Yoshiro Nagao

Warren Gallin | 31 Mar 21:48 2014

Eutilities not responding


	My scripts, which were working last night, are now failing with a time-out message as follows:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Response Error
Status read failed: Operation timed out
STACK: Error::throw
STACK: Bio::Root::Root::throw /opt/local/lib/perl5/site_perl/5.12.3//Bio/Root/
STACK: Bio::DB::GenericWebAgent::get_Response /opt/local/lib/perl5/site_perl/5.12.3//Bio/DB/
STACK: Bio::DB::EUtilities::get_Response /opt/local/lib/perl5/site_perl/5.12.3//Bio/DB/
STACK: Bio::DB::EUtilities::get_Parser /opt/local/lib/perl5/site_perl/5.12.3//Bio/DB/
STACK: Bio::DB::EUtilities::next_History /opt/local/lib/perl5/site_perl/5.12.3//Bio/DB/
STACK: NCBI_Retrival::eutilities_getdata

When I try to ping, all the packets are lost, with no returns whatsoever.

Can anyone else reproduce this?  I cannot tell if it is something local to me or something happening on the
NCBI side of things.


Warren Gallin

dayong guo | 31 Mar 19:34 2014

online cas9 design tool

Dear All,

COD (Cas9 & Off-target Designer) is an online tool, written in Perl, for 
designing Cas9 RNA-directed DNA nucleases. I made it as a spare-time hobby. 
Please try it if you need it, and let me know your 

Thanks a lot for the bioperl google group!

Dave Messina | 29 Mar 03:00 2014

Fwd: edit to

Hi Volker,

Your revert does work for yn00 output, but unfortunately breaks the code
for another type of output (see Bug 3332), one of many horrendous format
issues PAML brings.

I've committed a fix (12ddb53) that does special-case parsing for yn00. I
haven't tested it extensively, so please don't hesitate to try it out and
submit an improved version via pull request on GitHub if need be.


On Thu, Mar 27, 2014 at 6:01 PM, Jason Stajich wrote:

> Hi - great to hear that it is still useful - I have long stopped having
> time to try and track the versions of PAML and the output changes so that
> our parser will work better.
>  I am CC-ing Dave Messina who was the last person to touch this module I
> believe as he may be able to take a look at it and contribute - this looks
> extra simple so it may be easy enough for Dave or someone else to do this
> fix.
>  I would encourage you to contribute fixes directly in the future if you
> can at the GitHub repository -- esp. if you are using the module and are
> undoubtedly hitting some issues with the format.
> Thanks,