Cacau Centurion | 10 Oct 19:46 2014

Problems when using PAML::yn00

Hi All,

I tried to use Bio::Tools::Run::Phylo::PAML::Yn00 to run yn00 and parse the result, but no results were returned. Does anyone know what the problem might be?

The following code is obtained from 

use strict;
use warnings;
use Bio::Tools::Run::Phylo::PAML::Yn00;
use Bio::AlignIO;

# Read the first alignment from the FASTA file given on the command line
my $alignio = Bio::AlignIO->new(
    -format => 'fasta',
    -file   => $ARGV[0],
);
my $aln = $alignio->next_aln;

# Run yn00 on the alignment and walk through the parsed results
my $yn = Bio::Tools::Run::Phylo::PAML::Yn00->new();
$yn->alignment($aln);
my ( $rc, $parser ) = $yn->run;
while ( my $result = $parser->next_result ) {
    my @otus     = $result->get_seqs();
    my $MLmatrix = $result->get_MLmatrix();

    # 0 and 1 correspond to the 1st and 2nd entry in the @otus array
    my $dN   = $MLmatrix->[0]->[1]->{dN};
    my $dS   = $MLmatrix->[0]->[1]->{dS};
    my $kaks = $MLmatrix->[0]->[1]->{omega};
    print "Ka = $dN Ks = $dS Ka/Ks = $kaks\n";
}


###########################################################
Alignment:
>1
aaattgttgttg
>2
aacaatttgttg


Yours,
Cacau
_______________________________________________
Bioperl-l mailing list
Bioperl-l <at> mailman.open-bio.org
http://mailman.open-bio.org/mailman/listinfo/bioperl-l
Andreas Prlic | 7 Oct 20:52 2014

The NIH Software Discovery Index | We invite your comments -- a system for linking software, publications and users in the research community.

Greetings Everyone,

On behalf of a number of software developers, end users, and publishers associated with the scientific analysis community, we would like to invite all of you to review a document generated as a result of an NIH BD2K-supported meeting that focused on the opportunities and challenges of developing a software management ecosystem that could be valuable for finding and linking software, publications and users in the research community. You may also be aware of a related project, the Data Discovery Index, which will be fully integrated with the software system.

The product of this workshop and the subsequent discussion is a document which details the opportunities and challenges of developing a Software Discovery Index that would enable researchers to find, cite, and link software and analysis tools, publications, and researchers. To ensure that the opportunities, challenges, and recommendations detailed in the document reflect the breadth of experience from the community, we are seeking your input. In conjunction with related efforts already under way at NIH, including the development of a Data Discovery Index, the final document will be used by the NIH Office of the Associate Director for Data Science (ADDS) to inform a strategy for the development of a Software Discovery Index and a commons ecosystem for data, software, and resources.

We need your help to ensure that this critical task is achieved: to guide the development of a community-based system that gives credit and acknowledgment to the builders and maintainers of the software we all depend on! We invite all users, software developers, publishers, and software repository administrators to review our report prior to its submission to the NIH. Please complete your review and post comments by November 1, 2014.

The link to the report is here: http://softwarediscoveryindex.org

On behalf of the organizing committee, thank you for your assistance!

Organizing Committee

Owen White
Director of Bioinformatics, University of Maryland, Baltimore, School of Medicine
Co-Chair of NIH BD2K Software Index Workshop

Asif Dhar
Principal & Chief Medical Informatics Officer
Co-Chair of NIH BD2K Software Index Workshop

Vivien Bonazzi
Senior Advisor for Data Science Technologies (ADDS)
Co-Chair of BD2K Software and Methods Group

Jennifer Couch
Chief, Structural Biology and Molecular Applications Branch
NCI Co-Chair of BD2K Software and Methods Group

Chris Wellington
Program Director (NHGRI)

Alexey Morozov | 30 Sep 08:20 2014

Getting pairwise alignment scores for existing multiple alignment

Dear colleagues,
Is there a method in BioPerl that will calculate pairwise alignment scores for any given pair of genes in an MSA (according to a given matrix and gap opening/extension costs)? It seems that the Bio::SimpleAlign methods only work with a score if it has been given in the MSA file, and can only hold a single score for the whole multiple sequence alignment.
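
To make the question concrete, here is a minimal sketch in plain Perl of what such a calculation could look like. It is not an existing BioPerl method: the function name pairwise_score, the toy +1/-1 scoring callback (a stand-in for a real substitution matrix) and the gap costs are all assumptions made for illustration. The first column of a gap run pays the opening cost, each further column pays the extension cost, and columns that are gapped in both rows are skipped because they only exist for the sake of other sequences in the MSA.

```perl
use strict;
use warnings;

# Score two rows taken from the same multiple alignment.
# $score is a code ref returning the substitution score for two residues;
# $gap_open and $gap_ext are positive affine gap costs (illustrative values).
sub pairwise_score {
    my ($row_a, $row_b, $score, $gap_open, $gap_ext) = @_;
    die "rows differ in length\n" unless length($row_a) == length($row_b);

    my $total  = 0;
    my $in_gap = '';    # '', 'a' or 'b': which row the current gap run is in
    for my $i (0 .. length($row_a) - 1) {
        my $x = substr($row_a, $i, 1);
        my $y = substr($row_b, $i, 1);
        my $gap_x = $x eq '-';
        my $gap_y = $y eq '-';

        next if $gap_x && $gap_y;    # gapped in both rows: ignore the column

        if ($gap_x || $gap_y) {
            my $side = $gap_x ? 'a' : 'b';
            # first column of a gap run costs $gap_open, later ones $gap_ext
            $total -= ($in_gap eq $side) ? $gap_ext : $gap_open;
            $in_gap = $side;
        }
        else {
            $total += $score->(uc $x, uc $y);
            $in_gap = '';
        }
    }
    return $total;
}

# Toy scoring function: +1 match, -1 mismatch (stand-in for a real matrix).
my $simple = sub { $_[0] eq $_[1] ? 1 : -1 };

my $s = pairwise_score('AC--GT-A', 'ACTTG--A', $simple, 5, 1);
print "score = $s\n";
```

With a Bio::SimpleAlign object, the two rows could be taken from the sequence strings of the objects returned by each_seq, and the callback could look values up in whatever matrix is in use.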

--
Alexey Morozov,
LIN SB RAS, bioinformatics group.
Irkutsk, Russia.
Mark A. Jensen | 30 Sep 05:41 2014

Spankin' new (alpha) build system for Bioperl-Run

All (esp. George)-

My work on Issue #11 (https://github.com/bioperl/bioperl-run/issues/11)
has metastasized.

The proximate problem was tests that fail because of once-local
prerequisites. The ultimate problems are

- Why should I have to install every single wrapper when I only want X?
- Why should I care about any test that doesn't deal with X?
- Why doesn't X bring along its own prereq metadata (including Bio
   prereqs), rather than tag along with the distro and hope for the best?

(And I think these are the ultimate problems across BioPerl in terms of
decentralized distribution.)

My solution was

- Add to the distro real, manually prepared metadata on prerequisites for
   all the tools
- Add an interactive selector that allows a user to pick their desired
   tools at perl Build.PL-time
- Have Module::Build check only (and ALL) the prereqs of the desired tools,
   and inform user of missing ones at perl Build.PL-time
- Make use of the persistence of the config information to skip/run .t
   files as appropriate
- Update ALL the tests to check whether to skip based on user selection
- Make M::B install only the relevant distro modules and documentation,
   not everything, at ./Build install-time
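
The prereq-checking step in the list above can be sketched in plain Perl. This is an illustration only, not the code on the topic branch: the feature names toolA and toolB, the module No::Such::Module::Example and the helper missing_prereqs are all hypothetical, the metadata is merely shaped like "optional_features" in CPAN::Meta::Spec, and version requirements are ignored.

```perl
use strict;
use warnings;

# Hypothetical feature metadata, shaped like "optional_features"
# in CPAN::Meta::Spec.
my %optional_features = (
    toolA => {
        description => 'Wrapper for hypothetical tool A',
        prereqs     => { runtime => { requires => { 'List::Util' => 0 } } },
    },
    toolB => {
        description => 'Wrapper for hypothetical tool B',
        prereqs     => {
            runtime => { requires => { 'No::Such::Module::Example' => 0 } },
        },
    },
);

# Return the modules required by the selected features that cannot be loaded.
# Only module presence is checked; versions are ignored in this sketch.
sub missing_prereqs {
    my ($features, @selected) = @_;
    my %missing;
    for my $name (@selected) {
        my $requires = $features->{$name}{prereqs}{runtime}{requires} || {};
        for my $module (sort keys %$requires) {
            (my $file = "$module.pm") =~ s{::}{/}g;
            $missing{$module} = 1 unless eval { require $file; 1 };
        }
    }
    my @names = sort keys %missing;
    return @names;
}

# Only the selected feature is checked; the unselected toolB is never looked at.
my @missing = missing_prereqs(\%optional_features, 'toolA');
print @missing
    ? "Missing: @missing\n"
    : "All prereqs for the selection are present\n";
```

The point of the design is that a user who selects only toolA is never told about, or blocked by, a prerequisite that only toolB needs.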

This is ready for brave alpha-testers at 
https://github.com/bioperl/bioperl-run/tree/topic/issue11.
Just do 'perl Build.PL'.

Pod below has some more details-- comments very welcome

MAJ

NAME
     Bio::Tools::Run::Build - Instrument the build for features

SYNOPSIS

...

DESCRIPTION
     Bio::Tools::Run::Build is a subclass of Module::Build that allows an
     author to offer users the ability to select and install pre-configured
     subsets of modules that are packaged in a single large M::B-based
     distribution.

     Grouping and selection of distro modules is driven by the optional
     features concept as defined in CPAN::Meta::Spec and used by
     Module::Build.

     The subclass provides the following:

     *   Author specification of features and their prereqs

         The build author develops metadata files in json that follow
         "optional_features" in CPAN::Meta::Spec to group distribution
         modules and dependencies as selectable features.

     *   Interactive user selection of features

         The user can be presented with an interactive selector during
         Build.PL runs.

     *   Prereq checking of user selected features only

         M::B only checks for the presence of selected feature dependencies.

     *   Build-persistent recording of user selections

         The build object records the selection of features in the
         $build->feature field. This can be used in test files to determine
         whether tests should be skipped (and not failed). See
         Bio::Tools::Run::Build::Test.

     *   Installation only of selected feature modules

         Bio::Tools::Run::Build adds a build action, "deselect", which runs
         after the "code" and "docs" actions. "deselect" removes unselected
         modules from the blib/lib directory and unneeded documentation from
         the blib/libdoc directory. This keeps the "install" action from
         installing unwanted files.

MOTIVATION
     The BioPerl-Run distribution contains a large variety of wrappers and
     parsers that handle the execution and output of many different
     bioinformatics tools. It has been provided as a large distro that
     installs and attempts to test all of its modules. Many users need only a
     small fraction of the functionality BioPerl-Run provides, relevant only
     to the tools they have installed. On the other hand, managing many
     different packages is unwieldy and uninviting for volunteer maintainers.

     The system described here is a compromise that enables a user to select,
     test and install only those modules that meet the need, yet reduces the
     maintenance effort to the management of a set of metadata files in a
     single distribution.

...
Adam Sjøgren | 29 Sep 17:17 2014

Invalid EMBL files generated in rare circumstances; line wrapping

  Hi.

If you craft a tag on a feature sneakily (or if you are unlucky)
Bio::SeqIO will create invalid EMBL, separating the "/" from the
qualifier name:

    ID   unknown; SV 1; linear; unassigned DNA; STD; UNC; 4 BP.
    XX
    AC   unknown;
    XX
    XX
    XX
    FH   Key             Location/Qualifiers
    FH
    FT   CDS             1..4
    FT                   /
    FT                   note="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    FT                   X"
    XX
    SQ   Sequence 4 BP; 1 A; 1 C; 1 G; 1 T; 0 other;
         actg                                                                      4
    //

In this example "/" and "note" are on separate lines, which is wrong; at
least, BioPerl itself does not accept it.

Here is a script to create the above output (BioPerl 1.6.901 used):

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Bio::Seq::RichSeq;
    use Bio::SeqFeature::Generic;
    use IO::String;
    use Bio::SeqIO;

    my $seq=Bio::Seq::RichSeq->new(-display_id=>'TEST', -seq=>'actg');
    my $cds=Bio::SeqFeature::Generic->new(-primary_tag=>'CDS', -start=>1, -end=>4);
    $cds->add_tag_value(note=>'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX X');
    $seq->add_SeqFeature($cds);

    my $string;
    my $str=IO::String->new($string);
    my $io=Bio::SeqIO->new(-fh=>$str, -format=>'embl');
    $io->write_seq($seq);
    print $string;

Changing the position of the space in the note makes a/the difference.

Maybe there is a bug lurking in the line wrapping/formatting code
somewhere...

Does this sound like a bug to anyone else?
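
For anyone who wants to screen generated files for this symptom, a small check in plain Perl might look like the following. It is only a sketch of a detector, not a fix for the wrapping code, and the function name lone_slash_lines is made up for illustration. It reports FT lines that contain nothing but the "/".

```perl
use strict;
use warnings;

# Return the 1-based line numbers of FT lines that hold nothing but "/".
sub lone_slash_lines {
    my ($embl_text) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\n/, $embl_text) {
        $n++;
        push @hits, $n if $line =~ m{^\s*FT\s+/\s*$};
    }
    return @hits;
}

# A shortened version of the feature table from the example output.
my $embl = join "\n",
    'FT   CDS             1..4',
    'FT                   /',
    'FT                   note="XXXX',
    'FT                   X"',
    '';

my @bad = lone_slash_lines($embl);
print "Dangling '/' on line(s): @bad\n" if @bad;
```

The same function could be applied to the $string variable filled by the script above.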

  Best regards,

    Adam

-- 
                                                          Adam Sjøgren
                                                    adsj <at> novozymes.com

Daniel Lang | 22 Sep 20:13 2014

Parent/parent_id attribute

Hi,

I'm using bioperl 1.6.923-1 (Ubuntu Trusty package) and
Bio::DB::SeqFeature to store and manipulate GFF3 files.

I'm wondering why the "Parent" GFF3 attributes are stored as parent_id
values in the feature objects, but not returned as such in the gff3_string?

GFF3:
Chr01   transdecoder    mRNA    5216    5627    .       +       .
ID=T1.Chr01.mRNA.1;Parent=T1.Chr01.gene.1;Alias=T1.asmbl_1|m.6484,T1.ORF;Name=T1.Chr01.mRNA.1

Example debugger trace after fetching stored feature:

x $f
0  Bio::DB::SeqFeature=HASH(0x3e3a798)
   'attributes' => HASH(0x3e3a858)
      'Alias' => ARRAY(0x3e3a8b8)
         0  'T1.asmbl_1|m.6484'
         1  'T1.ORF'
      'load_id' => ARRAY(0x3e3aca8)
         0  'T1.Chr01.mRNA.1'
      'parent_id' => ARRAY(0x3e3acf0)
         0  'T1.Chr01.gene.1'
   'is_circular' => 0
   'name' => 'T1.Chr01.mRNA.1'
   'phase' => undef
   'primary_id' => 2428
   'ref' => 'Chr01'
   'score' => undef
   'source' => 'transdecoder'
   'start' => 5216
   'stop' => 5627
   'store' => Bio::DB::SeqFeature::Store::DBI::mysql=HASH(0x39b95d0)
      'class_loaded' => HASH(0x3e3a2b8)
         'Bio::DB::SeqFeature' => 1
      'dbh' => DBI::db=HASH(0x3dc1e40)
           empty hash
      'dumpdir' => '/tmp'
      'is_temp' => undef
      'namespace' => undef
      'seqfeatureclass' => 'Bio::DB::SeqFeature'
      'settings_cache' => HASH(0x3dc1d98)
         'autoindex' => 1
         'compress' => 0
         'index_subfeatures' => 1
         'serializer' => 'Storable'
      'writeable' => undef
   'strand' => 1
   'type' => 'mRNA'

x $f->gff3_string
0
"Chr01\cItransdecoder\cImRNA\cI5216\cI5627\cI.\cI+\cI.\cIName=T1.Chr01.mRNA.1;ID=2428;Alias=T1.asmbl_1%7Cm.6484,T1.ORF"

What is the best practice to store parentage? I'm currently adding an
additional "Parent" value using add_tag_value.

Or is this a bug in the version I'm using?
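
To show what I would expect column 9 to contain, here is a sketch in plain Perl that rebuilds it from an attribute hash shaped like the one in the trace. It does not use or describe the BioPerl API: the mapping of load_id to ID and parent_id to Parent, and the helper names gff3_escape and column9, are assumptions for illustration, and the escaping covers only a small set of characters reserved in GFF3 attributes.

```perl
use strict;
use warnings;

# Attribute hash shaped like the 'attributes' entry in the trace above.
my %attributes = (
    Alias     => [ 'T1.asmbl_1|m.6484', 'T1.ORF' ],
    load_id   => ['T1.Chr01.mRNA.1'],
    parent_id => ['T1.Chr01.gene.1'],
);

# Internal key => GFF3 attribute name (assumed mapping, for illustration).
my %rename = ( load_id => 'ID', parent_id => 'Parent' );

# Percent-encode a few characters that GFF3 reserves inside column 9.
sub gff3_escape {
    my ($value) = @_;
    $value =~ s/([;=&,%\t\n\r])/sprintf '%%%02X', ord $1/ge;
    return $value;
}

# Build the attribute column: ID and Parent first, then the rest sorted.
sub column9 {
    my ($attr) = @_;
    my %out;
    for my $key ( keys %$attr ) {
        my $tag = exists $rename{$key} ? $rename{$key} : $key;
        $out{$tag} = join ',', map { gff3_escape($_) } @{ $attr->{$key} };
    }
    my @first = grep { exists $out{$_} } qw(ID Parent);
    my @rest  = sort { $a cmp $b } grep { !/^(?:ID|Parent)$/ } keys %out;
    return join ';', map {"$_=$out{$_}"} @first, @rest;
}

print column9( \%attributes ), "\n";
```

This keeps the original load ID and the Parent relationship in the output, which is what the input GFF3 line had.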

Best,
Daniel
-- 

Dr. Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax:        +49 761 203 6945
phone:      +49 761 203 6989
homepage:   http://www.plant-biotech.net/
            http://www.cosmoss.org/
e-mail:     daniel.lang <at> biologie.uni-freiburg.de

#################################################
My software never has bugs.
It just develops random features.
#################################################
Fields, Christopher J | 17 Sep 00:39 2014

Re: Whither Bio::FeatureIO?

It *might* be possible to set this up on Travis-CI independently for Bio::FeatureIO, which would be beneficial from a testing viewpoint (as we need to track what works w/ refactored FeatureIO vs what doesn’t).

I suppose what we need to check with a refactor (master branch) is:

1) Maintaining a sane amount of compat. with Chado. 'Sane' meaning Bio::SF::Annotated will need to be chucked or completely reimplemented from scratch, as it is much less than sane now.
2) If needed, having a concurrently developed version of Chado to make it work.

It may not require much on #2 if Chado isn’t reliant on some of the less API-friendly parts of Bio::SF::Annotated (namely the heavy annotation associated with it).  

chris

On Sep 16, 2014, at 4:53 PM, George Hartzell <hartzell.george <at> gene.com> wrote:

@scott, do you have a test setup for the GMOD stuff?

g.


On Tue, Sep 16, 2014 at 1:41 PM, Fields, Christopher J <cjfields <at> illinois.edu> wrote:
Cool!  I guess I could probably announce this as being released at some point now :)

chris

PS - I may have a decent test environment set up for longer-term evaluation, but it would be nice to see if we can get something working with travis-ci or a smoker setup, just so I can check whether the main branch refactoring is clobbering chado (as I suspect it is).  

On Sep 16, 2014, at 1:50 PM, George Hartzell <hartzell.george <at> gene.com> wrote:

Hi All,

It took a while, but I was finally able to run my little litmus test and the good news is that it appears to pass.

I modified my ansible playbook that implements the steps described in INSTALL.Chado so that it uses the version of Bio::FeatureIO that is now on CPAN instead of pulling the github master.

The resulting installation ran to completion and then was able to load the yeast gff3 file:

cp /vagrant/saccharomyces_cerevisiae.gff .
gmod_gff3_preprocessor.pl --gfffile saccharomyces_cerevisiae.gff --outfile saccharomyces_cerevisiae.sorted.gff
gmod_bulk_load_gff3.pl --organism yeast --gfffile saccharomyces_cerevisiae.gff.sorted

and the resulting database seems to be stitched together reasonably (though I’m not a particularly informed judge of its character).

@chris thanks for the help on this!!!!

g.


On Sat, Aug 30, 2014 at 9:24 PM, George Hartzell <hartzell <at> alerce.com> wrote:
Fields, Christopher J writes:
 > Just a quick update on this: I released a separate Bio::FeatureIO
 > release to CPAN that represents the code split out from the core
 > modules:
 >
 >    https://metacpan.org/pod/Bio::FeatureIO
 >
 > I had to do some cleanup to get code to work and tests passing with
 > some sanity.  A *lot* of things were not passing tests when we
 > moved this over.
 >
 > This should represent what was last working with Chado though.
 > However, I haven’t officially announced anything yet b/c I would
 > like to shake bugs out of it. Can either of you try this out on a
 > Chado run to make sure everything is up to snuff (or at least point
 > out issues)?  Time depending, I would like to get something running
 > on (for instance) Travis-CI, maybe including some optional
 > Chado-related stuff.  This would also help so that we can work on
 > merging what has been done on master so that these pass the same
 > tests.

I can't do anything until Tuesday, but will be happy to run it through
the standard Chado build process when I get back to work.

Thanks for digging into it.

g.




Matthew Laird | 16 Sep 23:16 2014

GenBank files CONTIG line

Good afternoon,

I wanted to report what I think is an issue, but I'm not positive yet. I
found this old mailing list posting from May
(http://lists.open-bio.org/pipermail/bioperl-l/2014-May/071583.html)
about the changes to NCBI's GenBank files, and I just grabbed the latest
bioperl-live with August's patch to hopefully solve it. That part worked
great: instead of spewing a few GB of warnings and the whole sequence
multiple times, it read the GenBank file and wrote out an EMBL file
perfectly fine.

However, the current bioperl-live created a new issue. I have a mirror
of NCBI's bacterial genomes directory (yes, I know, I need to move to
the new directory structure in the next 6 months), and this pipeline
takes the GenBank file and makes the embl, ptt, faa, and fna files as
needed. This usually takes seconds. Whatever changed in bioperl-live
compared to BioPerl 1.6.922 causes the script to spin doing something
very intensely for tens of minutes, slowly writing out the ptt file.

Simply copying genbank.pm from bioperl-live to my install directory
both solved the CONTIG issue and kept the whole conversion process
speedy. So I'm happy for now, but I wanted to mention this in case it
rings a bell with anyone on what could have changed to make parsing a
gbk into a ptt so much less efficient now.

Thanks.

-- 
Matthew Laird
Lead Software Developer, Bioinformatics
Brinkman Laboratory
Simon Fraser University, Burnaby, BC, Canada
Fields, Christopher J | 16 Sep 22:41 2014

Re: Whither Bio::FeatureIO?

Cool!  I guess I could probably announce this as being released at some point now :)

chris

PS - I may have a decent test environment set up for longer-term evaluation, but it would be nice to see if we can get something working with travis-ci or a smoker setup, just so I can check whether the main branch refactoring is clobbering chado (as I suspect it is).  

On Sep 16, 2014, at 1:50 PM, George Hartzell <hartzell.george <at> gene.com> wrote:

Hi All,

It took a while, but I was finally able to run my little litmus test and the good news is that it appears to pass.

I modified my ansible playbook that implements the steps described in INSTALL.Chado so that it uses the version of Bio::FeatureIO that is now on CPAN instead of pulling the github master.

The resulting installation ran to completion and then was able to load the yeast gff3 file:

cp /vagrant/saccharomyces_cerevisiae.gff .
gmod_gff3_preprocessor.pl --gfffile saccharomyces_cerevisiae.gff --outfile saccharomyces_cerevisiae.sorted.gff
gmod_bulk_load_gff3.pl --organism yeast --gfffile saccharomyces_cerevisiae.gff.sorted

and the resulting database seems to be stitched together reasonably (though I’m not a particularly informed judge of its character).

@chris thanks for the help on this!!!!

g.


On Sat, Aug 30, 2014 at 9:24 PM, George Hartzell <hartzell <at> alerce.com> wrote:
Fields, Christopher J writes:
 > Just a quick update on this: I released a separate Bio::FeatureIO
 > release to CPAN that represents the code split out from the core
 > modules:
 >
 >    https://metacpan.org/pod/Bio::FeatureIO
 >
 > I had to do some cleanup to get code to work and tests passing with
 > some sanity.  A *lot* of things were not passing tests when we
 > moved this over.
 >
 > This should represent what was last working with Chado though.
 > However, I haven’t officially announced anything yet b/c I would
 > like to shake bugs out of it. Can either of you try this out on a
 > Chado run to make sure everything is up to snuff (or at least point
 > out issues)?  Time depending, I would like to get something running
 > on (for instance) Travis-CI, maybe including some optional
 > Chado-related stuff.  This would also help so that we can work on
 > merging what has been done on master so that these pass the same
 > tests.

I can't do anything until Tuesday, but will be happy to run it through
the standard Chado build process when I get back to work.

Thanks for digging into it.

g.


Dmitry Karasik | 12 Sep 20:16 2014

memory leak in Bio::Species

Dear all,

I've hit a memory leak that OOMs our daemons once in a while, and what is
worse, I don't have the BioPerl expertise to fix it (I would send a patch
otherwise). The problem is that Bio::Species and/or the modules it uses form
cyclic references, which Perl never frees automatically. I guess
either some internal data structure has to be reworked, or there should be some
strategic placing of Scalar::Util::weaken, but I have no idea where (or rather,
I could devise a hack that applies weaken() right away, but I don't think
this is the right approach).

It's very simple to reproduce, e.g. with this:

	use Devel::Cycle;
	use Bio::Species;
	find_cycle(Bio::Species->new(-classification => ['A']));

which outputs

	Cycle (1):
           $Bio::Species::A->{'taxon'} => \%Bio::Taxon::B               
         $Bio::Taxon::B->{'_ancestor'} => \%Bio::Taxon::C               
             $Bio::Taxon::C->{'_desc'} => \%D                           
                             $D->{'1'} => \%Bio::Taxon::B               

	Cycle (2):
            $Bio::Species::A->{'tree'} => \%Bio::Tree::Tree::E          
        $Bio::Tree::Tree::E->{'_rootnode'} => \%Bio::Taxon::C               
             $Bio::Taxon::C->{'_desc'} => \%D                           
                             $D->{'1'} => \%Bio::Taxon::B               
         $Bio::Taxon::B->{'_ancestor'} => \%Bio::Taxon::C     

whereas I would expect it would print nothing.
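
To illustrate the kind of fix I mean, here is a toy example in plain Perl. Toy::Node and build_pair are made-up stand-ins, not the real Bio::Taxon layout; the hash keys _desc and _ancestor are borrowed from the cycle report above only to make the parallel visible. Which reference should actually be weakened inside BioPerl is exactly the part I can't judge.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Toy stand-in for a node class; counts how many objects get destroyed.
my $destroyed = 0;
{
    package Toy::Node;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub DESTROY { $destroyed++ }
}

# Build a parent/child pair that reference each other, then drop both.
sub build_pair {
    my ($weaken_back_ref) = @_;
    my $parent = Toy::Node->new(name => 'parent');
    my $child  = Toy::Node->new(name => 'child');
    $parent->{_desc}{1} = $child;     # strong reference: parent owns child
    $child->{_ancestor} = $parent;    # back reference closes the cycle
    weaken($child->{_ancestor}) if $weaken_back_ref;
    return;                           # both lexicals go out of scope here
}

build_pair(0);
print "strong back reference: $destroyed object(s) freed\n";

$destroyed = 0;
build_pair(1);
print "weak back reference:   $destroyed object(s) freed\n";
```

With the strong back reference neither object is freed when build_pair returns; with the weakened one both are. The catch is that a weakened ancestor reference becomes undef as soon as nothing else holds the parent, so something (for example a tree object) has to keep the root alive.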

I would very much like to ask the devs to take a closer look. The issue is on
github: https://github.com/bioperl/bioperl-live/issues/81

Thank you in advance!

-- 
Sincerely,
	Dmitry Karasik
D. Joe | 12 Sep 19:10 2014

web site?


Went to look for the web site for the first time in a while, seemed down. 
Checked isitdownforeveryoneorjustme and it's down for them, too.

Tried both bioperl.org (as advertized for the project at Github,
which is still up) and www.bioperl.org.

DNS doesn't look set to expire until December 2014, so that doesn't seem to
be it.

www.bioperl.org resolves for me to 54.243.166.98, as does www.open-bio.org
(which is also down, so it's not, for instance, just a problem with the
bioperl-specific web server settings)

mailman.open-bio.org resolves to 54.243.246.167 and seems to be up, so I
expect this message will go through at least (since the MX for bioperl.org
points to lists.open-bio.org which resolves to the same as
mailman.open-bio.org).

Not sure if this is the best place to report it, but there it is.

Hope that helps,

-- 
D. Joe
man screen | grep -A2 weird
  A weird imagination is most useful to gain full advantage of
  all the features.
