Michiel de Hoon | 1 Feb 2008 08:22

Re: [BioPython] blast parse

I have added a DeprecationWarning to NCBIXML.BlastParser.parse.

--Michiel.
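
For readers unfamiliar with the mechanism, here is a minimal sketch of how a function can announce its own deprecation with Python's standard warnings module. The function name parse_all and its message are made up for illustration; this is not the actual Biopython code.

```python
import io
import warnings


def parse_all(handle):
    """Hypothetical stand-in for a method that is being phased out."""
    # stacklevel=2 attributes the warning to the caller, not to this function.
    warnings.warn(
        "parse_all() is deprecated; iterate over the records instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return handle.read()


# DeprecationWarning is often hidden by default, so record it explicitly here.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = parse_all(io.StringIO("<BlastOutput/>"))

print(caught[0].category.__name__)
```

The old call keeps working, so existing scripts do not break; they only start reporting that a replacement should be used.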

Christof Winter <winter <at> biotec.tu-dresden.de> wrote:
Michiel de Hoon wrote:
> Dear Jose,
> 
> To get the records one-by-one, use
> 
> from Bio.Blast import NCBIXML
> blast_parse = NCBIXML.parse(blasth)
> for blast_result in blast_parse:
>     pass  # do whatever with blast_result
> 
> This avoids having to read the complete XML file all at once.
> 
> To the developers: We should probably think about removing the
> NCBIXML.BlastParser.parse, and perhaps adding a NCBIXML.read function to read
> exactly one record from the XML file.

I think removing NCBIXML.BlastParser.parse is a good idea.
We should keep it simple.

Christof
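
As an illustration of the "read exactly one record" idea discussed above, here is a rough sketch in plain Python. The name read_one is hypothetical, and this is only one possible way such a helper could behave (raising ValueError for zero or several records); it is not a description of any Biopython implementation.

```python
def read_one(records):
    """Return the only item of an iterable, or raise ValueError."""
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found") from None
    try:
        next(iterator)
    except StopIteration:
        # The iterator is exhausted after one item, which is what we want.
        return first
    raise ValueError("More than one record found")


print(read_one(["only record"]))
```

Because it works on any iterable, a helper like this could wrap a record-by-record parser without reading the whole file first, beyond peeking for a second record.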

       
_______________________________________________
BioPython mailing list  -  BioPython <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython


Ernesto | 5 Feb 2008 15:55

[BioPython] GFF parser

Dear All,

I found on the Internet a very interesting GFF parser written in
Python by Martin Knudsen. Since, as far as I know, there isn't a real
GFF parser in BioPython at the moment, we could think about adding
Martin's, after asking the author for permission, of course.
The parser can be downloaded from the following web page:
http://www.daimi.au.dk/~martink/birc/scripts.html

Hope this helps,

Ernesto

--------------------------------------------------------
Dr Ernesto Picardi, PhD
Dept. of Biochemistry and Molecular Biology
University of Bari
Italy
E-mail: e.picardi <at> unical.it
--------------------------------------------------------


Chris Lasher | 6 Feb 2008 04:27

[BioPython] Biopython to begin transition to Subversion

Hello all Biopythonistas,

In the upcoming weeks, Biopython will begin and complete its
transition from CVS to Subversion (SVN) as its revision control
system.

This transition will likely not affect end users of Biopython except
that to get the development version, a checkout with a Subversion
client, rather than a CVS client, will be necessary.

For developers, we will need to determine a suitable range of dates (a
week) during which we will "freeze" the CVS repository for its
transition to SVN. From the freeze onward, commits to the CVS
repository will no longer be possible. Instead, any commits not made
before the freeze will need to take place in the Subversion repository
once we have it running. This week, we hope to have a "dry run" of the
Subversion repository available for the developers to poke around and
make sure the transition will include everything necessary. Following
that, we'll have the freeze and complete the transition.

If you have any questions, I'll be checking posts to the list, or you
may feel free to contact me directly.

Best,
Chris


Chris Fields | 6 Feb 2008 04:33

Re: [BioPython] Biopython to begin transition to Subversion

Let me know if you need any help.

chris

On Feb 5, 2008, at 9:27 PM, Chris Lasher wrote:

> Hello all Biopythonistas,
>
> In the upcoming weeks, Biopython will begin and complete its
> transition from CVS to Subversion (SVN) as its revision control
> system.
>
> This transition will likely not affect end users of Biopython except
> that to get the development version, a checkout with a Subversion
> client, rather than a CVS client, will be necessary.
>
> For developers, we will need to determine a suitable range of dates (a
> week) during which we will "freeze" the CVS repository for its
> transition to SVN. From the freeze onward, commits to the CVS
> repository will no longer be possible. Instead, any commits not made
> before the freeze will need to take place in the Subversion repository
> once we have it running. This week, we hope to have a "dry run" of the
> Subversion repository available for the developers to poke around and
> make sure the transition will include everything necessary. Following
> that, we'll have the freeze and complete the transition.
>
> If you have any questions, I'll be checking posts to the list, or you
> may feel free to contact me directly.
>
> Best,

Andrew Dalke | 6 Feb 2008 12:03

Re: [BioPython] [bip] Bioinformatics Programming Language Shootout, Python performance poopoo'd

On Feb 6, 2008, at 11:44 AM, Peter wrote:
> Am I right in thinking the authors have not made any of their sample
> input files available?  In the case of the multi GB Blast file, this
> is perhaps justified.  Also I didn't see any timing script.

The alignment programs contain the test data.

The fasta parser and blast parser do not contain test data.  The lack
of data is not justified, as having a 9 GB file adds little to the
comparison over having a 9 MB file, since the run time should scale
linearly with file size.  It does show that the parsers can handle
large files, but big whoop.  And the test is unaffected by having a
9 MB file duplicated 1,000 times.

The neighbor-joining code contains no test data.

There's no timing script.

				Andrew
				dalke <at> dalkescientific.com
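
To make the point about duplicated input concrete, here is a small sketch of what a timing script might look like. Everything in it is an assumption for illustration: the record-counting function is a toy, not one of the benchmarked parsers, and the sample sequences are made up.

```python
import io
import time


def count_fasta_records(handle):
    """Count records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def time_parse(text):
    """Return (record count, elapsed seconds) for one pass over the text."""
    start = time.perf_counter()
    count = count_fasta_records(io.StringIO(text))
    return count, time.perf_counter() - start


# Made-up sample data; a real comparison would use the benchmark's own input.
small = ">seq1\nACGTACGT\n>seq2\nGGCCTTAA\n" * 1000
large = small * 100  # the same "file" duplicated 100 times

small_count, small_seconds = time_parse(small)
large_count, large_seconds = time_parse(large)
print(small_count, large_count)
print(f"time ratio: {large_seconds / small_seconds:.1f}")
```

If a parser scales linearly, the time ratio should come out near the duplication factor, which is why a small published test file would be enough for others to repeat the comparison.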


Jose Blanca | 6 Feb 2008 17:06

[BioPython] Alignment add_sequence

Hello,
I'm building an alignment object from a set of seqRecords using the following 
code:
        from Bio.Align.Generic import Alignment
        from Bio.Alphabet import IUPAC
        my_alpha = IUPAC.IUPACAmbiguousDNA()
        ali = Alignment(my_alpha)
        for seqName in sequences.keys():
            seq = sequences[seqName].seq.tostring()
            start = mesh[seqName]['location_begin']
            id = sequences[seqName].id
            ali.add_sequence(id, seq, start)
Is this the best way to do it? Everything is working as expected, but I have a
problem with this implementation. My seqRecords have additional annotations
and I'm losing them. Maybe this could be solved with a new function like:
    def add_sequence(self,  seqRecord, start = None, end = None,
                     weight = 1.0):
Also, this way we wouldn't need to create a new SeqRecord for every
sequence, and it should be quicker. The result could be something like:
        from Bio.Align.Generic import Alignment
        from Bio.Alphabet import IUPAC
        my_alpha = IUPAC.IUPACAmbiguousDNA()
        ali = Alignment(my_alpha)
        for seqName in sequences.keys():
            start = mesh[seqName]['location_begin']
            ali.add_sequence(sequences[seqName], start)

With such a function, a problem could appear if an annotation named 'start'
or 'end' is already in the annotation dict. But this could be solved by raising
an exception in that case. What do you think?
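
A rough, self-contained sketch of the proposal follows. The classes Record and SketchAlignment and the method name add_record are hypothetical stand-ins written for this example, not Biopython classes; the sketch only shows the behaviour described: keep the record that is passed in, and raise an exception instead of overwriting an existing 'start' or 'end' annotation.

```python
class Record:
    """Minimal stand-in for a SeqRecord (hypothetical, not the Biopython class)."""

    def __init__(self, id, seq, annotations=None):
        self.id = id
        self.seq = seq
        self.annotations = dict(annotations or {})


class SketchAlignment:
    """Hypothetical container that keeps the records it is given."""

    def __init__(self):
        self.records = []
        self.weights = []

    def add_record(self, record, start=None, end=None, weight=1.0):
        """Store the record itself, so its existing annotations survive."""
        for key, value in (("start", start), ("end", end)):
            if value is None:
                continue
            # Raise rather than silently overwrite an existing annotation.
            if key in record.annotations:
                raise ValueError(f"{record.id}: annotation {key!r} already set")
            record.annotations[key] = value
        self.records.append(record)
        self.weights.append(weight)


ali = SketchAlignment()
rec = Record("seq1", "ACGT-ACGT", {"organism": "example"})
ali.add_record(rec, start=5)
print(ali.records[0].annotations)
```

Storing the caller's record rather than building a new one is what keeps the annotations; the trade-off is that the alignment and the caller then share the same object.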

Peter Cock | 6 Feb 2008 17:20

Re: [BioPython] Alignment add_sequence

On Feb 6, 2008 4:06 PM, Jose Blanca <jblanca <at> btc.upv.es> wrote:
> Hello,
> I'm building an alignment object from a set of seqRecords using the following
> code:
> ...
> Is this the best way to do it?

No, not really.  See below ..

> Everything is working as expected, but I have a
> problem with this implementation. My seqRecords have additional annotations
> and I'm losing them.

Yes, using that method the alignment is creating a new SeqRecord for
each sequence with no annotation.

> Maybe this could be solved with a new function like:
>     def add_sequence(self,  seqRecord, start = None, end = None,
>                      weight = 1.0):

This has been discussed before, along with other limitations of the
current alignment class, e.g. on bug 1944
http://bugzilla.open-bio.org/show_bug.cgi?id=1944

Right now I would suggest you try the Bio.SeqIO.to_alignment()
function, although this doesn't try and do anything clever with
start/end annotation: http://biopython.org/wiki/SeqIO

Peter

Paulo Nuin | 6 Feb 2008 17:07

Re: [BioPython] [bip] Bioinformatics Programming Language Shootout, Python performance poopoo'd

Hi all

I am running pylint on the code to get an evaluation of it.

Currently the alignment.py scored -10.16/10, mainly because of 
indentation issues and lack of spaces between operators.
NJ.py scored -7.66/10
parse.py scored -6.10/10
readFasta.py scored -7.00/10

Of course this test just measures the "Pythonic" level of the code, but 
it does not check the code itself for quality.

Cheers

Paulo
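
A note on how a score "out of 10" can be negative: as far as I recall, pylint's default evaluation expression at the time subtracted a penalty from 10 with no lower bound. The sketch below reproduces that expression under this assumption; the message counts are made up and are not the actual counts for any of the files above.

```python
def pylint_style_score(error, warning, refactor, convention, statement):
    """Assumed default pylint evaluation: 10 minus a per-statement penalty."""
    penalty = float(5 * error + warning + refactor + convention) / statement
    return 10.0 - penalty * 10


# Made-up counts: 100 statements with 10 errors and 150 other messages.
score = pylint_style_score(
    error=10, warning=50, refactor=0, convention=100, statement=100
)
print(round(score, 2))
```

With more than one weighted message per statement the penalty exceeds 10, so files with many indentation and spacing messages end up below zero.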

Andrew Dalke wrote:
> On Feb 6, 2008, at 11:44 AM, Peter wrote:
>> Am I right in thinking the authors have not made any of their sample
>> input files available?  In the case of the multi GB Blast file, this
>> is perhaps justified.  Also I didn't see any timing script.
>
> the alignment programs contain the test data.
>
> the fasta parser and blast parser do not contain test data.  The lack 
> of data is not justified as having a 9GB file adds little to the 
> comparison over having a 9 MB file as it should scale linearly.  It 
> does show that the parsers can handle large files, but big whoop.  And 
> the test is unaffected by having a 9MB file duplicated 1,000 times.

Colosimo, Marc E. | 6 Feb 2008 16:28

Re: [BioPython] [bip] Bioinformatics Programming Language Shootout, Python performance poopoo'd

<rant>
What is biology in python or, more to the point, why is there yet another
mailing list (Web site?) for biology in python?

From looking at their archived messages:

1. Need to establish python/biology community.....

Isn't that what BioPython is? If not, why not?

I'll also point out that there is "CoreBio" a python toolkit for
writing computational biology applications
<http://code.google.com/p/corebio/>

I don't want to subscribe to another mailing list, install another
suite of code, keep track of another Web site.
</rant> 

-----Original Message-----
From: biopython-bounces <at> lists.open-bio.org
[mailto:biopython-bounces <at> lists.open-bio.org] On Behalf Of Andrew Dalke
Sent: Wednesday, February 06, 2008 6:04 AM
To: biopython <at> lists.open-bio.org
Cc: biology-in-python <at> lists.idyll.org
Subject: Re: [BioPython] [bip] Bioinformatics Programming Language
Shootout, Python performance poopoo'd

On Feb 6, 2008, at 11:44 AM, Peter wrote:
> Am I right in thinking the authors have not made any of their sample
> input files available?  In the case of the multi GB Blast file, this

Tiago Antão | 6 Feb 2008 18:05

Re: [BioPython] Biopython to begin transition to Subversion

Hi,

On Feb 6, 2008 4:27 PM, Peter <biopython <at> maubp.freeserve.co.uk> wrote:
> Michiel - do you think we should try and do another release before the
> CVS freeze and migration?  We've had a lot of little changes, plus
> Tiago's PopGen work and my own efforts with BioSQL.  There are still a
> few open issues, but I think a release soon would be reasonable
> (depending on your time commitments of course).

Just FYI: As I noticed that the SVN move would be happening sooner or
later, I decided to put everything into a stable state and stop at
that point.
Hopefully everything PopGen-related is stable and ready to move
(code, tests, docs).
As soon as we move to SVN I will get back into committing (now the
really interesting stuff will start: statistics and maybe HapMap).

Tiago

