Michiel de Hoon | 4 Sep 2011 08:09

Bio.GenBank

Dear all,

Currently, Bio/GenBank/__init__.py imports Bio.ParserSupport but uses very little of it. Therefore I
would like to suggest removing this dependency on ParserSupport from Bio/GenBank/__init__.py. I
have copied the corresponding patch below.
Any objections?

Best,
--Michiel

diff --git a/Bio/GenBank/__init__.py b/Bio/GenBank/__init__.py
index 43c10d4..df38abe 100644
--- a/Bio/GenBank/__init__.py
+++ b/Bio/GenBank/__init__.py
@@ -47,7 +47,6 @@ import re

 # other Biopython stuff
 from Bio import SeqFeature
-from Bio.ParserSupport import AbstractConsumer
 from Bio import Entrez

 # other Bio.GenBank stuff
@@ -389,7 +388,7 @@ class RecordParser(object):
         self._scanner.feed(handle, self._consumer)
         return self._consumer.data

-class _BaseGenBankConsumer(AbstractConsumer):
+class _BaseGenBankConsumer(object):
     """Abstract GenBank consumer providing useful general functions.

[...]

Peter Cock | 5 Sep 2011 12:04

Re: Bio.GenBank

On Sun, Sep 4, 2011 at 7:09 AM, Michiel de Hoon <mjldehoon <at> yahoo.com> wrote:
> Dear all,
>
> Currently, Bio/GenBank/__init__.py imports Bio.ParserSupport
> but uses very little of it. Therefore I would like to suggest
> removing this dependency on ParserSupport from
> Bio/GenBank/__init__.py. I copied the corresponding patch below.
> Any objections, anybody?

Hi Michiel,

I'd have to dig into the code to understand the patch, but
I presume there is a follow up question coming - can we
then deprecate Bio.ParserSupport since right now only the
GenBank and "pending deprecation" plain text BLAST
parsers use it (plus Compass which you recently fixed)?

Peter

Michiel de Hoon | 5 Sep 2011 13:08

Re: Bio.GenBank

Hi Peter,

> I'd have to dig into the code to understand the patch, but
> I presume there is a follow up question coming - can we
> then deprecate Bio.ParserSupport since right now only the
> GenBank and "pending deprecation" plain text BLAST
> parsers use it (plus Compass which you recently fixed)?

Yes. With this patch, the plain text BLAST parser is the last piece of code that uses Bio.ParserSupport.

Best,
--Michiel.

--- On Mon, 9/5/11, Peter Cock <p.j.a.cock <at> googlemail.com> wrote:

> From: Peter Cock <p.j.a.cock <at> googlemail.com>
> Subject: Re: [Biopython-dev] Bio.GenBank
> To: "Michiel de Hoon" <mjldehoon <at> yahoo.com>
> Cc: biopython-dev <at> biopython.org
> Date: Monday, September 5, 2011, 6:04 AM
> On Sun, Sep 4, 2011 at 7:09 AM,
> Michiel de Hoon <mjldehoon <at> yahoo.com>
> wrote:
> > Dear all,
> >
> > Currently, Bio/GenBank/__init__.py imports
> Bio.ParserSupport
> > but uses very little of it. Therefore I would like to
> suggest to
> > remove this dependency on ParserSupport from
[...]

Peter Cock | 7 Sep 2011 14:58

Re: Bio.GenBank

On Mon, Sep 5, 2011 at 12:08 PM, Michiel de Hoon wrote:
> Hi Peter,
>
>> I'd have to dig into the code to understand the patch, but
>> I presume there is a follow up question coming - can we
>> then deprecate Bio.ParserSupport since right now only the
>> GenBank and "pending deprecation" plain text BLAST
>> parsers use it (plus Compass which you recently fixed)?
>
> Yes. With this patch, the plain text BLAST parser is the last
> piece of code that uses Bio.ParserSupport.

I'm OK with modifying Bio.GenBank not to depend on
Bio.ParserSupport, and if you want to, adding an "obsolete"
comment or, more explicitly, a PendingDeprecationWarning
to Bio.ParserSupport seems sensible too.
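
For anyone unfamiliar with the mechanism, here is a minimal sketch of how a module can announce a pending deprecation. The helper name warn_pending_deprecation and the message text are made up for illustration; this is not the code that went into Bio.ParserSupport. In a real module the warnings.warn call would normally sit at module level so that it fires on import.

```python
import warnings


def warn_pending_deprecation(module_name):
    """Hypothetical helper: announce that a module is on its way out."""
    warnings.warn(
        "%s is obsolete and is likely to be deprecated in a future release."
        % module_name,
        PendingDeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )


# Python's default filters usually hide PendingDeprecationWarning,
# so enable all warnings here in order to capture it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_pending_deprecation("Bio.ParserSupport")

message_text = str(caught[0].message)
```

Because the warning is normally hidden, existing user scripts keep running silently; only people who turn warnings on (or run a test suite) will notice it.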

Peter

Michiel de Hoon | 7 Sep 2011 15:53

Bio.File

Hi all,

Bio.File makes three classes available:
Bio.File.UndoHandle
Bio.File.StringHandle (which simply points to StringIO.StringIO)
Bio.File.SGMLStripper (which has a pending deprecation warning)

Bio.File.StringHandle is currently used only in Bio.Blast.NCBIStandalone and Bio.ParserSupport,
both of which now have a pending deprecation warning.

Bio.File.UndoHandle is used in three modules that now have a pending deprecation warning
(Bio.Blast.NCBIStandalone, Bio.ParserSupport, Bio.UniGene.UniGene), as well as in
Bio.SCOP.__init__. I don't know why the UndoHandle is used in that module; the relevant code looks like this:

def _open(cgi, params={}, get=1):
    ...
    handle = urllib.urlopen(cgi, options)
    uhandle = File.UndoHandle(handle)
    return uhandle

If there is no pressing reason for using File.UndoHandle here and we can remove it, then we could add a
PendingDeprecationWarning to Bio.File.
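
As background on what an undo-capable handle offers, here is a rough sketch of the idea: a wrapper that lets a caller look at the next line and push it back, for example to check whether a server reply starts with an error message before handing it to a parser. The class name PeekableHandle is hypothetical, and the code is an illustration of the concept only, not the Bio.File.UndoHandle implementation.

```python
import io


class PeekableHandle:
    """Hypothetical stand-in for an undo-capable file handle."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []  # lines pushed back; returned before any new reads

    def readline(self):
        """Return a pushed-back line if there is one, else read a new line."""
        if self._saved:
            return self._saved.pop(0)
        return self._handle.readline()

    def saveline(self, line):
        """Push a line back so that the next readline() returns it again."""
        if line:
            self._saved.insert(0, line)

    def peekline(self):
        """Return the next line without consuming it."""
        line = self.readline()
        self.saveline(line)
        return line


# Look at the first line of a (made-up) reply without consuming it.
handle = PeekableHandle(io.StringIO("ERROR: bad request\ndetails\n"))
first = handle.peekline()
looks_like_error = first.startswith("ERROR")
```

If nothing downstream of _open ever peeks or pushes lines back, the wrapper adds nothing and the plain handle could be returned instead.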

Best,
--Michiel.

Peter Cock | 7 Sep 2011 16:36

Re: Bio.File

On Wed, Sep 7, 2011 at 2:53 PM, Michiel de Hoon <mjldehoon <at> yahoo.com> wrote:
> Hi all,
>
> Bio.File makes three classes available:
> Bio.File.UndoHandle
> Bio.File.StringHandle (which simply points to StringIO.StringIO)
> Bio.File.SGMLStripper (which has a pending deprecation warning)
>
> Bio.File.StringHandle is currently used only in
> Bio.Blast.NCBIStandalone and Bio.ParserSupport,
> both of which now have a pending deprecation warning.

We can just switch them to use StringIO directly, and immediately
deprecate Bio.File.StringHandle.
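
One possible shape for such a deprecation, as a sketch only: keep the old name alive as a thin shim that warns and then hands back a plain StringIO object. The shim below is hypothetical, and it uses Python 3's io.StringIO as a stand-in for the StringIO.StringIO class mentioned above.

```python
import io
import warnings


def StringHandle(buffer=""):
    """Hypothetical deprecated shim: warn, then return a plain StringIO."""
    warnings.warn(
        "StringHandle is deprecated; use StringIO directly instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the calling code
    )
    return io.StringIO(buffer)


# Old-style call sites keep working, but now trigger a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    handle = StringHandle("some text\n")

contents = handle.getvalue()
```

Internal callers would be switched to StringIO first, so that only external code still using the old name sees the warning.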

We can probably deprecate SGMLStripper now as well (which
means indirectly deprecating the bit of Bio.ParserSupport
which uses it).

> Bio.File.UndoHandle is used in three modules that now have a
> pending deprecation warning (Bio.Blast.NCBIStandalone,
> Bio.ParserSupport, Bio.UniGene.UniGene), as well as in
> Bio.SCOP.__init__. I don't know why the UndoHandle is
> used in that module; the relevant code looks like this:
>
> def _open(cgi, params={}, get=1):
>    ...
>    handle = urllib.urlopen(cgi, options)
>    uhandle = File.UndoHandle(handle)
>    return uhandle
[...]

Michiel de Hoon | 8 Sep 2011 16:35

Re: Bio.File

--- On Wed, 9/7/11, Peter Cock <p.j.a.cock <at> googlemail.com> wrote:
> > Bio.File.StringHandle is currently used only in
> > Bio.Blast.NCBIStandalone and Bio.ParserSupport,
> > both of which now have a pending deprecation warning.
> 
> We can just switch them to use StringIO directly, and
> immediately
> deprecate Bio.File.StringHandle.
> 
> We can probably deprecate SGMLStripper now as well (which
> means indirectly deprecating the bit of Bio.ParserSupport
> which uses it).
> 
OK, done.

--Michiel.

Michiel de Hoon | 8 Sep 2011 16:49

Re: Bio.File


--- On Wed, 9/7/11, Peter Cock <p.j.a.cock <at> googlemail.com> wrote:

> UndoHandle used to be used in Bio.Entrez for spotting
> error conditions, but now we trust the NCBI to set an
> HTTP return code:
> 
> https://github.com/biopython/biopython/commit/2c4d8b99fc1b2dffa726e7d9956d766f7013164d

No, we shouldn't rely on an HTTP return code. The idea is that only the parser can know whether the output
returned by NCBI is valid, as in:

handle = Entrez.efetch(...something...)
try:
    record = Entrez.read(handle)
except Exception:
    # NCBI returned something invalid, or at least
    # something that we don't know how to parse

> If the server could be relied on to always give an
> HTTP error code this wouldn't be needed:
> 
> https://github.com/peterjc/biopython/blob/togows/Bio/TogoWS/__init__.py
> 

I don't like this approach much, as it depends on exactly what the error message looks like, and misses any
other problems, such as incomplete output. There will be a certain false positive rate, with return
values that pass the checking of the first 10 lines but are still unusable. Even worse, the false positive
rate can suddenly go up if the server maintainers decide to change anything in their error messages. This
kind of checking should be done by the parser, which can tell you exactly if the data are valid, or if not,
[...]

Andrea Pierleoni | 8 Sep 2011 16:47

Biograpy 1.0 beta released

Hi,
one year ago we were talking about a library I was developing, basically to
draw a SeqRecord in a similar way to the BioPerl Bio::Graphics module.
Today I'm releasing the public beta version of that software, which is much
more mature than it was a year ago. The library is called BioGraPy and is
based on matplotlib for drawing and on Biopython objects for input.
Basically, you can give BioGraPy a SeqRecord and it will draw it and save it
in any of the matplotlib-supported formats (including PNG, SVG and PDF).
You can also use it at a lower level, deciding exactly how and where to plot
every feature, and so build very complex drawings. It comes with
integrated help for web usage, such as clickable SVG and HTML maps.
BioGraPy also supports continuous features such as a hydrophobicity plot and
SeqRecord per-letter annotations (if numerical).
All the code is documented with Sphinx, and I'm also completing a comprehensive
tutorial. The source code and the documentation are available at:

http://apierleoni.github.com/BioGraPy/

BioGraPy is released under the LGPL license.
This is an open project, so anyone willing to contribute, test or simply
suggest improvements is welcome.
You cannot plot circular drawings with BioGraPy, but you have GenomeDiagram
for that.
I hope (and think) this will be useful, significantly extending the
Biopython plotting
[...]

Peter Cock | 8 Sep 2011 17:25

Re: Bio.File

On Thu, Sep 8, 2011 at 3:49 PM, Michiel de Hoon <mjldehoon <at> yahoo.com> wrote:
>
> No, we shouldn't rely on an HTTP return code. The idea is that only
> the parser can know if the output returned by NCBI is valid, as in:
>
> handle = Entrez.efetch(...something...)
> try:
>    record = Entrez.read(handle)
> except Exception:
>    # NCBI returned something invalid, or at least
>    # something that we don't know how to parse

In theory, yes, but quite often parsers look for certain
patterns and if you feed them something else they may
just say "no data". For example, the GenBank parser
ignores anything before the LOCUS line (in order to
cope with the free text header in the large multi-record
files on the NCBI FTP site). As a side effect, you can
give it almost any plain text file and the parser won't
raise an error - it will just say no GenBank records
found.
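
To make that side effect concrete, here is a small self-contained sketch. The toy parser parse_records is hypothetical and much simpler than the real GenBank parser; it only imitates the "skip everything before LOCUS" behaviour described above. The wrapper read_at_least_one (also a made-up name) shows one way calling code could turn "no records" into an explicit error.

```python
import io


def parse_records(handle):
    """Toy parser: ignore text before a LOCUS line, yield blocks ending in //."""
    record = None
    for line in handle:
        if line.startswith("LOCUS"):
            record = [line]  # start collecting a new record
        elif record is not None:
            record.append(line)
            if line.startswith("//"):
                yield "".join(record)
                record = None


def read_at_least_one(handle):
    """Treat an empty result as an error instead of silently returning nothing."""
    records = list(parse_records(handle))
    if not records:
        raise ValueError("No records found; the input may be an error page")
    return records


good = io.StringIO("free text header\nLOCUS AB000001\nORIGIN\n//\n")
bad = io.StringIO("<html>Service unavailable</html>\n")

records = read_at_least_one(good)
try:
    read_at_least_one(bad)
    rejected = False
except ValueError:
    rejected = True
```

The tolerant parser alone accepts both inputs without complaint; only the explicit record count distinguishes the error page from real data.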

>> If the server could be relied on to always give an
>> HTTP error code this wouldn't be needed:
>>
>> https://github.com/peterjc/biopython/blob/togows/Bio/TogoWS/__init__.py
>>
>
> I don't like this approach much, as it depends on exactly
> what the error message looks like, and misses any other
[...]

