Fields, Christopher J | 17 Sep 00:39 2014

Re: Whither Bio::FeatureIO?

It *might* be possible to set this up on Travis-CI independently on Bio::FeatureIO, which would be beneficial from a testing viewpoint (as we need to track what works w/ refactored FeatureIO vs what doesn’t).  

I suppose what we need to check with a refactor (master branch) is:

1) Maintaining a sane amount of compatibility with Chado.  'Sane' meaning Bio::SF::Annotated will need to be chucked or completely reimplemented from scratch, as it is much less than sane now.
2) If needed, having a concurrently developed version of Chado to make it work.

It may not require much on #2 if Chado isn’t reliant on some of the less API-friendly parts of Bio::SF::Annotated (namely the heavy annotation associated with it).  

chris

On Sep 16, 2014, at 4:53 PM, George Hartzell <hartzell.george <at> gene.com> wrote:

@scott, do you have a test setup for the GMOD stuff?

g.


On Tue, Sep 16, 2014 at 1:41 PM, Fields, Christopher J <cjfields <at> illinois.edu> wrote:
Cool!  I guess I could probably announce this as being released at some point now :)

chris

PS - I may have a decent test environment set up for longer-term evaluation, but it would be nice to see if we can get something working with travis-ci or a smoker setup, just so I can check whether the main branch refactoring is clobbering chado (as I suspect it is).  

On Sep 16, 2014, at 1:50 PM, George Hartzell <hartzell.george <at> gene.com> wrote:

Hi All,

It took a while, but I was finally able to run my little litmus test and the good news is that it appears to pass.

I modified my ansible playbook that implements the steps described in INSTALL.Chado so that it uses the version of Bio::FeatureIO that is now on CPAN instead of pulling the github master.

The resulting installation ran to completion and then was able to load the yeast gff3 file:

cp /vagrant/saccharomyces_cerevisiae.gff .
gmod_gff3_preprocessor.pl --gfffile saccharomyces_cerevisiae.gff --outfile saccharomyces_cerevisiae.sorted.gff
gmod_bulk_load_gff3.pl --organism yeast --gfffile saccharomyces_cerevisiae.gff.sorted

and the resulting database seems to be stitched together reasonably (though I’m not a particularly informed judge of its character).

@chris thanks for the help on this!!!!

g.


On Sat, Aug 30, 2014 at 9:24 PM, George Hartzell <hartzell <at> alerce.com> wrote:
Fields, Christopher J writes:
 > Just a quick update on this: I released a separate Bio::FeatureIO
 > release to CPAN that represents the code split out from the core
 > modules:
 >
 >    https://metacpan.org/pod/Bio::FeatureIO
 >
 > I had to do some cleanup to get code to work and tests passing with
 > some sanity.  A *lot* of things were not passing tests when we
 > moved this over.
 >
 > This should represent what was last working with Chado though.
 > However, I haven’t officially announced anything yet b/c I would
 > like to shake bugs out of it. Can either of you try this out on a
 > Chado run to make sure everything is up to snuff (or at least point
 > out issues)?  Time depending, I would like to get something running
 > on (for instance) Travis-CI, maybe including some optional
 > Chado-related stuff.  This would also help so that we can work on
 > merging what has been done on master so that these pass the same
 > tests.

I can't do anything until Tuesday, but will be happy to run it through
the standard Chado build process when I get back to work.

Thanks for digging into it.

g.




_______________________________________________
Bioperl-l mailing list
Bioperl-l <at> mailman.open-bio.org
http://mailman.open-bio.org/mailman/listinfo/bioperl-l
Matthew Laird | 16 Sep 23:16 2014

GenBank files CONTIG line

Good afternoon,

I wanted to report what I think is an issue but I'm not positive yet.  I 
found this old mailing list posting from May 
(http://lists.open-bio.org/pipermail/bioperl-l/2014-May/071583.html) 
about the changes to NCBI's genbank files, and I just grabbed the latest 
bioperl live with August's patch to hopefully solve it.  That part 
worked great: instead of spewing a few GB of warnings and the whole 
sequence multiple times, it read the genbank file and wrote out an embl 
file perfectly fine.

However the current bioperl live created a new issue.  I have a mirror 
of NCBI's bacterial genomes directory (yes, I know, I need to move to 
the new directory structure in the next 6 months) and this pipeline 
takes the genbank file and makes the embl, ptt, faa, and fna as needed. 
  This usually takes seconds.  Whatever changed in bioperl live compared 
to BioPerl 1.6.922 causes the script to spin doing something very 
intensely for tens of minutes, slowly writing out the ptt file.

Simply copying genbank.pm from bioperl live to my install directory 
solved both the CONTIG issue and kept the whole conversion process 
speedy.  So I'm happy for now, but I wanted to mention this in case it 
rings a bell with anyone on what could have changed to make parsing a 
gbk into a ptt so much less efficient now.

Thanks.

-- 
Matthew Laird
Lead Software Developer, Bioinformatics
Brinkman Laboratory
Simon Fraser University, Burnaby, BC, Canada
Fields, Christopher J | 16 Sep 22:41 2014

Re: Whither Bio::FeatureIO?

Cool!  I guess I could probably announce this as being released at some point now :)

chris

PS - I may have a decent test environment set up for longer-term evaluation, but it would be nice to see if we can get something working with travis-ci or a smoker setup, just so I can check whether the main branch refactoring is clobbering chado (as I suspect it is).  

Dmitry Karasik | 12 Sep 20:16 2014

memory leak in Bio::Species

Dear all,

I've hit a memory leak issue that OOMs our daemons once in a while, and what is
worse, I don't have the BioPerl expertise to fix it (I would send a patch
otherwise). The problem is that Bio::Species and/or the modules it uses form
cyclic references, which perl's reference counting never frees automatically. I guess
either some internal data structure has to be reworked, or there should be some
strategic placing of Scalar::Util::weaken, but I have no idea where (or rather,
I could devise a hack that hammers weaken() instantly, but I don't think
this is the right approach).

It's very simple to reproduce, e.g. with this:

	use Devel::Cycle;
	use Bio::Species;
	find_cycle(Bio::Species->new(-classification => ['A']));

which outputs

	Cycle (1):
           $Bio::Species::A->{'taxon'} => \%Bio::Taxon::B               
         $Bio::Taxon::B->{'_ancestor'} => \%Bio::Taxon::C               
             $Bio::Taxon::C->{'_desc'} => \%D                           
                             $D->{'1'} => \%Bio::Taxon::B               

	Cycle (2):
            $Bio::Species::A->{'tree'} => \%Bio::Tree::Tree::E          
        $Bio::Tree::Tree::E->{'_rootnode'} => \%Bio::Taxon::C               
             $Bio::Taxon::C->{'_desc'} => \%D                           
                             $D->{'1'} => \%Bio::Taxon::B               
         $Bio::Taxon::B->{'_ancestor'} => \%Bio::Taxon::C     

whereas I would expect it to print nothing.
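
For illustration only, here is a minimal sketch of the weaken() idea mentioned above. It uses a made-up Demo::Node class, not the real Bio::Taxon internals; the hash keys _ancestor and _desc merely mirror the names in the Devel::Cycle output, and where BioPerl itself should weaken is still the open question. The sketch counts how many nodes are freed once the last outside reference goes away.

```perl
use strict;
use warnings;
use Scalar::Util ();

package Demo::Node;    # hypothetical stand-in, not a BioPerl class

our $destroyed = 0;    # number of nodes whose DESTROY has run

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, _desc => {} }, $class;
}

sub add_child {
    my ($self, $child, %opt) = @_;
    $self->{_desc}{ $child->{name} } = $child;    # parent -> child (strong)
    $child->{_ancestor} = $self;                  # child -> parent (closes the cycle)
    # Weakening the back-reference lets the parent's refcount reach zero.
    Scalar::Util::weaken($child->{_ancestor}) if $opt{weak};
    return $child;
}

sub DESTROY { $destroyed++ }

package main;

# Build a two-node tree, drop the only outside reference, and report
# how many of the two nodes were actually freed.
sub build_and_drop {
    my (%opt) = @_;
    $Demo::Node::destroyed = 0;
    {
        my $root = Demo::Node->new(name => 'C');
        $root->add_child(Demo::Node->new(name => 'B'), %opt);
    }    # $root goes out of scope here
    return $Demo::Node::destroyed;
}

printf "strong back-reference: %d of 2 nodes freed\n", build_and_drop();
printf "weak back-reference:   %d of 2 nodes freed\n", build_and_drop(weak => 1);
# prints 0 of 2 for the strong case and 2 of 2 for the weak case
```

The trade-off of weakening the child-to-parent link is that a child no longer keeps its ancestor alive, so something else must hold the root for as long as the tree is needed.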

I would very much like to ask the devs for a closer look. It's here on
github: https://github.com/bioperl/bioperl-live/issues/81

Thank you in advance!

-- 
Sincerely,
	Dmitry Karasik
D. Joe | 12 Sep 19:10 2014

web site?


Went to look for the web site for the first time in a while, seemed down. 
Checked isitdownforeveryoneorjustme and it's down for them, too.

Tried both bioperl.org (as advertised for the project at Github,
which is still up) and www.bioperl.org.

DNS doesn't look set to expire until December 2014, so that doesn't seem to
be it.  

www.bioperl.org resolves for me to 54.243.166.98, as does www.open-bio.org
(which is also down, so it's not, for instance, just a problem with the
bioperl-specific web server settings)

mailman.open-bio.org resolves to 54.243.246.167 and seems to be up, so I
expect this message will go through at least (since the MX for bioperl.org
points to lists.open-bio.org which resolves to the same as
mailman.open-bio.org).

Not sure if this is the best place to report it, but there it is.

Hope that helps,

-- 
D. Joe
man screen | grep -A2 weird
  A weird imagination is most useful to gain full advantage of
  all the features.
vandana Baranwal | 31 Aug 07:08 2014

Converting pdb files to dssp file

Hello
Is it possible to convert a PDB file into a DSSP file using BioPerl? I searched a lot but didn't find a useful answer.
I want to convert approximately 1000 pdb files into dssp files.

Any help will be highly appreciated

--
Thanks & Regards
Vandana Kumari
Brian Osborne | 28 Aug 16:21 2014

Add to SeqFeature::Generic?

All,

Some SeqFeature modules (e.g. SeqFeature::Computation) allow things like this:

my @sourcefeats = $seq->remove_SeqFeatures('source');

my @sourcefeats = $seq->get_SeqFeatures('source');

SeqFeature::Generic can’t do this. Any objection to my adding this capability to Generic?

Thanks again,

Brian O.
Francisco J. Ossandón | 22 Aug 17:50 2014

Re: Proposal for bioperl-run

I have also looked 

I don't mind the moving of those modules but I would like to ask something. Should any future updates to the moved
modules also be applied, if possible, to the copies that remain in the v1.6.x branch? Or would their
development be frozen? Or should they be removed from there too?

I've been keeping modules that were moved out of -live into their own repos (like Root and Coordinate) in sync with
their versions in the v1.6.x branch, but syncing WrapperBase once it is merged into another existing repo like Run
will depend on whether it starts to depend on code from other Run modules.

On a side note, I just realized that there is an empty "hmmer3.pm" file in
"bioperl-live/Bio/Tools/Run"... It seems to have been added already empty in
https://github.com/bioperl/bioperl-live/commit/64ab09ecd40abb8cd06e9b80aafcda323d1dc47e,
maybe by mistake. It appears to be the only empty file in the repo. Should I delete it??

Cheers,

Francisco J. Ossandon

-----Original Message-----
From: bioperl-l-bounces+fossandonc=hotmail.com <at> mailman.open-bio.org
[mailto:bioperl-l-bounces+fossandonc=hotmail.com <at> mailman.open-bio.org] On behalf of Mark A. Jensen
Sent: Friday, August 22, 2014 8:09
To: Fields, Christopher J
CC: George Hartzell; bioperl-l <at> mailman.open-bio.org
Subject: Re: [Bioperl-l] Proposal for bioperl-run

Thanks Chris-- yes, you must be right. I just did a quick grep for "StandAloneBlast" in the modules. I will
verify and leave bl2seq alone.
Will do everything in branches and then give the signal.
cheers
MAJ

On 2014-08-21 23:50, Fields, Christopher J wrote:
> On Aug 21, 2014, at 9:37 PM, George Hartzell <hartzell <at> alerce.com>
> wrote:
>
>> Mark A. Jensen writes:
>>> [...]
>>> My simple proposal is to move these three modules from bioperl-live 
>>> to bioperl-run. (Only AlignIO/bl2seq.pm depends on StandAloneBlast, 
>>> btw).
>>>
>>> Thoughts?
>>> [...]
>>
>> Speaking from a safe distance, that sounds *wonderful*.
>>
>> g.
>
> Agreed.  Also, I think you mean that StandAloneBlast has a dependency 
> on AlignIO::bl2seq, not the other way around, correct? At least, I 
> didn’t see anything there.
>
> If no one objects to it (give it a day), I say go ahead and move it 
> over.
>
> chris

Mark A. Jensen | 22 Aug 03:49 2014

Proposal for bioperl-run

All,
I'm starting to look at cleaning up bioperl-run so that, e.g., tests 
don't bork without a reasonable error message, are skipped 
appropriately, etc. As I do this, I'm finding that to fix certain 
issues, I need to consider mods to Bio::Tools::Run::WrapperBase and 
friends. Alas, these are in the bioperl-live distro!

Now, grepping for 'WrapperBase' in bioperl-live modules turns up only 
three files (including WrapperBase.pm itself):

Bio/Tools/Run/WrapperBase/CommandExts.pm
Bio/Tools/Run/StandAloneBlast.pm
Bio/Tools/Run/WrapperBase.pm

My simple proposal is to move these three modules from bioperl-live to 
bioperl-run. (Only AlignIO/bl2seq.pm depends on StandAloneBlast, btw).

Thoughts?
MAJ
Cacau Centurion | 12 Aug 21:10 2014

Extract sequences of CDS regions from Genbank formatted file

Hi all,

I was wondering if there is a way to directly extract all sequences of CDS regions from a Genbank formatted file using bioperl?
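
One possible approach, as a minimal sketch: read the file with Bio::SeqIO, keep the features whose primary tag is CDS, and take each one's spliced_seq. The helper name cds_sequences, the use of the locus_tag qualifier as an identifier, and the fallback ID format are assumptions made for illustration, not an established convention.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Return [id, nucleotide string] pairs for every CDS feature on a
# sequence object. (Hypothetical helper, named here for illustration.)
sub cds_sequences {
    my ($seq) = @_;
    my @out;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';

        # Assumption: use locus_tag as the ID when present, otherwise
        # fall back to a made-up "<seq id>_<start>_<end>" label.
        my ($id) = $feat->has_tag('locus_tag')
            ? $feat->get_tag_values('locus_tag')
            : ();
        $id = sprintf('%s_%d_%d', $seq->display_id, $feat->start, $feat->end)
            unless defined $id;

        # spliced_seq joins the parts of split (join(...)) locations and
        # takes strand into account.
        push @out, [ $id, $feat->spliced_seq->seq ];
    }
    return @out;
}

# Usage: perl extract_cds.pl input.gbk > cds.fna
if (@ARGV) {
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        printf ">%s\n%s\n", @$_ for cds_sequences($seq);
    }
}
```

For protein rather than nucleotide output, calling translate on the object returned by spliced_seq would be the natural next step, though codon tables and partial CDS features need care.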

Yours,
Cacau
RB | 11 Aug 15:31 2014

www.ncbi.nlm.nih.gov:80 (Bad hostname)

Hello!

I'm trying to connect to a NCBI page and retrieve some information.
Basically I want to retrieve what is under "Representative" in 
http://www.ncbi.nlm.nih.gov/genome/?term=Xylella_fastidiosa . For this
I'm trying to use LWP::Simple or LWP::UserAgent, but I have not been able to
retrieve the HTML.

Here is my code:
-------------------------------------------------------
#!/usr/local/bin/perl
use strict;
use warnings;
use autodie;
use Data::Dump;
use LWP::Simple qw(get);

my $content = get('http://www.ncbi.nlm.nih.gov/genome/?term=Xylella_fastidiosa');

dd $content;
-------------------------------------------------------

But I get an undef from this code. I read through this post:
http://bioperl.996286.n3.nabble.com/Bio-Tools-Run-RemoteBlast-error-500-Can-t-connect-to-www-ncbi-nlm-nih-gov-80-td10210.html

and here is the result of two different pings:
-------------------------------------------------------
Pinging www.wip.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

-------------------------------------------------------
Pinging www.google.com [173.194.112.240] with 32 bytes of data:
Reply from 173.194.112.240: bytes=32 time=46ms TTL=53
Reply from 173.194.112.240: bytes=32 time=46ms TTL=52
Reply from 173.194.112.240: bytes=32 time=45ms TTL=53
Reply from 173.194.112.240: bytes=32 time=45ms TTL=53

Ping statistics for 173.194.112.240:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 45ms, Maximum = 46ms, Average = 45ms
-------------------------------------------------------

So I guess this means there is something wrong on my side? What am I doing
wrong? Thanks for all the help
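
A possible way to narrow this down, sketched with LWP::UserAgent (which the post already mentions): get() from LWP::Simple returns undef on any failure without saying why, whereas an HTTP::Response object carries a status line. The helper name describe_response and the 30-second timeout are made up for illustration, and the env_proxy call assumes a proxy might be involved. A ping timeout on its own may not prove much either, since a host can drop ICMP while still serving HTTP.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Summarise an HTTP::Response so the failure reason is visible.
# (Hypothetical helper, named here for illustration.)
sub describe_response {
    my ($resp) = @_;
    return $resp->is_success
        ? sprintf('OK: %d bytes', length $resp->decoded_content)
        : 'FAILED: ' . $resp->status_line;
}

# Usage: perl check_url.pl 'http://www.ncbi.nlm.nih.gov/genome/?term=Xylella_fastidiosa'
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);    # timeout value is arbitrary
    $ua->env_proxy;    # pick up http_proxy etc. from the environment, if set
    print describe_response($ua->get($ARGV[0])), "\n";
}
```

If the status line reads something like "500 Can't connect to ... (Bad hostname)", as in the subject of this thread, that would point towards name resolution or proxy settings on the client side rather than towards the Perl code itself.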
