Cedar McKay | 1 Oct 2009 01:14

Re: [Biopython] get back raw records with SeqIO?

> I hoped you would be - our mailing list discussion earlier in the year
> basically triggered including this in Biopython:
> http://lists.open-bio.org/pipermail/biopython/2009-June/005281.html
>
> Were you able to update your script using the precursor index code
> to use the new Bio.SeqIO.index function? It should have been a drop
> in replacement ;)

My head isn't at that code at the moment, but I'll try to give it a  
whirl next week.
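
For readers following the thread, the general idea of indexing a sequence file and getting raw records back can be sketched in plain Python: remember the byte offset of each record, then seek and read. This is only an illustration under simple assumptions (FASTA input, every ">" line carries an ID); it is not how Bio.SeqIO.index is implemented, and build_offset_index and get_raw are hypothetical helper names.

```python
import io


def build_offset_index(handle):
    """Map each FASTA record ID to (start offset, length in bytes)."""
    index = {}
    current_id = None
    start = 0
    offset = 0
    for line in handle:
        if line.startswith(b">"):
            # Close the previous record before starting a new one.
            if current_id is not None:
                index[current_id] = (start, offset - start)
            current_id = line[1:].split()[0].decode("ascii")
            start = offset
        offset += len(line)
    if current_id is not None:
        index[current_id] = (start, offset - start)
    return index


def get_raw(handle, index, key):
    """Return the unparsed bytes of one record."""
    start, length = index[key]
    handle.seek(start)
    return handle.read(length)


# In-memory stand-in for a binary file handle.
data = b">seq1 first\nACGT\nACGT\n>seq2 second\nGGCC\n"
handle = io.BytesIO(data)
index = build_offset_index(handle)
raw = get_raw(handle, index, "seq2")
print(raw.decode("ascii"))
```

Only the offsets are held in memory, so the records themselves are read on demand.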

> Why do you want to do this? I'd like to understand the desired usage.
I didn't have a specific technical reason. It just seemed like  
everything was going towards using SeqIO and things like Bio.Fasta  
were being deprecated, so I wanted to get ahead of the curve there.  
But if Bio.Genbank is going to be around for a long time, I don't have  
any problem with doing it that way.

Thanks again.

C

_______________________________________________
Biopython mailing list  -  Biopython <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython

Peter | 1 Oct 2009 10:06

Re: [Biopython] get back raw records with SeqIO?

On Thu, Oct 1, 2009 at 12:14 AM, Cedar McKay <cmckay <at> u.washington.edu> wrote:
>
>> Why do you want to do this? I'd like to understand the desired
>> usage.
>
> I didn't have a specific technical reason.

OK - if you come up with a good use case example, please let us know.

> It just seemed like everything was going towards using SeqIO and things
> like Bio.Fasta were being deprecated, so I wanted to get ahead of the
> curve there. But if Bio.Genbank is going to be around for a long time,
> I don't have any problem with doing it that way.

For more complicated file formats (e.g. GenBank, SwissProt, ACE,
PHRED, ...) mapping the data into SeqRecord objects isn't 100%
perfect. Here Bio.SeqIO really is just a unifying API sitting on top
of file format specific parsers (which live in other modules), which
is good enough for most tasks. Unless/until the SeqRecord objects
are a full mapping, any more file format specific data-structure still
has its uses - and thus I see no immediate pressure to remove
Bio.GenBank etc.

Unlike some of the Bio.SeqIO parsers, for "fasta" we don't use
an underlying module (such as Bio.Fasta), and the SeqRecord
can capture all of the annotation in the raw file. One reason
for this is that, at the time, Bio.Fasta still used Martel and was
noticeably slower than the pure Python code I adopted for
FASTA files in SeqIO. Since then Bio.Fasta has lost all the
Martel dependencies (which meant the loss of the old indexing

Denzel Li | 5 Oct 2009 19:38

[Biopython] Combine nexus files but not concatenating them

Hi all:
I noticed that the cookbook has a solution for combining NEXUS files
(http://biopython.org/wiki/Concatenate_nexus ). However, in that example the
alignments are concatenated. What if I instead want the following two files
combined into one file, as shown in "combinedFile.nex"?

# file1.nex
b1 GGG
b2 GGT

# file2.nex
b1 AAA
b2 AAT

# combinedFile.nex
begin data;
  dimensions ntax=2 nchar=6
[alignment from file1.nex]
b1 GGG
b2 GGT
[alignment from file2.nex]
b1 AAA
b2 AAT
;end;

begin sets;
charset a1=1-3;
charset a2=4-6;
end;

Peter | 5 Oct 2009 21:42

Re: [Biopython] Combine nexus files but not concatenating them

On Mon, Oct 5, 2009 at 6:38 PM, Denzel Li <denzel.dz.li <at> gmail.com> wrote:
> Hi all:
> I noticed that the cookbook has a solution for combining NEXUS files
> (http://biopython.org/wiki/Concatenate_nexus ). However, in that example the
> alignments are concatenated. What if I instead want the following two files
> combined into one file, as shown in "combinedFile.nex"?

I was under the impression that NEXUS files should only hold
one alignment matrix. Why do you need it done this way? Isn't
your example basically the same thing but interleaved?
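
To make the point concrete, here is a rough plain-Python sketch (not Bio.Nexus code) showing that joining the two example alignments end to end gives a single 2 x 6 matrix with the same charsets as the combined file in the question. The concatenate helper is hypothetical, and it assumes every block contains the same taxa.

```python
def concatenate(blocks):
    """Join per-taxon sequences from several alignments, in order.

    Returns the combined matrix and a dict of 1-based inclusive
    (start, end) charset ranges, one per input block.
    """
    combined = {}
    charsets = {}
    offset = 0
    for name, block in blocks:
        length = len(next(iter(block.values())))
        for taxon, seq in block.items():
            if len(seq) != length:
                raise ValueError("ragged alignment in block %s" % name)
            combined[taxon] = combined.get(taxon, "") + seq
        charsets[name] = (offset + 1, offset + length)
        offset += length
    return combined, charsets


# Data taken from file1.nex and file2.nex in the question.
file1 = {"b1": "GGG", "b2": "GGT"}
file2 = {"b1": "AAA", "b2": "AAT"}

matrix, charsets = concatenate([("a1", file1), ("a2", file2)])
for taxon, seq in matrix.items():
    print(taxon, seq)
for name, (start, end) in charsets.items():
    print("charset %s=%d-%d;" % (name, start, end))
```

Whether the matrix is written sequentially or interleaved, the character positions covered by a1 and a2 are the same.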

Peter


Denzel Li | 5 Oct 2009 22:00

Re: [Biopython] Combine nexus files but not concatenating them

Hi Peter:
Yes, it is basically the same thing returned by "nexus.combine" but
"interleaved". A further question: is it possible to split one NEXUS file
into several according to the charsets (or partitions) defined in the file?
For instance, in the concatenation example
(http://biopython.org/wiki/Concatenate_nexus ), could the combined file be
split into btCOI.nex, btCOII.nex and btITS.nex?

Thanks,
Denzel

On Mon, Oct 5, 2009 at 3:42 PM, Peter <biopython <at> maubp.freeserve.co.uk> wrote:

> On Mon, Oct 5, 2009 at 6:38 PM, Denzel Li <denzel.dz.li <at> gmail.com> wrote:
> > Hi all:
> > I noticed that the cookbook has a solution for combining NEXUS files
> > (http://biopython.org/wiki/Concatenate_nexus ). However, in that example
> > the alignments are concatenated. What if I instead want the following
> > two files combined into one file, as shown in "combinedFile.nex"?
>
> I was under the impression that NEXUS files should only hold
> one alignment matrix. Why do you need it done this way? Isn't
> your example basically the same thing but interleaved?
>
> Peter
>

Peter | 5 Oct 2009 22:31

Re: [Biopython] Combine nexus files but not concatenating them

On Mon, Oct 5, 2009 at 9:00 PM, Denzel Li <denzel.dz.li <at> gmail.com> wrote:
> Hi Peter:
> Yes, it is basically the same thing returned by "nexus.combine" but
> "interleaved".

Surely whether or not the data is interleaved is immaterial to the
meaning. Does the combined version following our wiki not work
for some 3rd party tool?

> A further question: is it possible to split one NEXUS file
> into several according to the charsets (or partitions)
> defined in the file? For instance, in the concatenation example
> (http://biopython.org/wiki/Concatenate_nexus ), could the
> combined file be split into btCOI.nex, btCOII.nex and btITS.nex?

Does the write_nexus_data_partitions() method of the Nexus
object do what you want?
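
As a plain-Python illustration of what splitting by charset amounts to (this does not call or reproduce write_nexus_data_partitions() itself), each charset is just a column range sliced out of every row. The split_by_charsets helper is hypothetical and only handles contiguous 1-based inclusive ranges; the data is the small example from earlier in the thread.

```python
def split_by_charsets(matrix, charsets):
    """Slice each taxon's sequence by 1-based inclusive (start, end) ranges."""
    return {
        name: {taxon: seq[start - 1:end] for taxon, seq in matrix.items()}
        for name, (start, end) in charsets.items()
    }


# Combined matrix and charsets from the combinedFile.nex example.
matrix = {"b1": "GGGAAA", "b2": "GGTAAT"}
charsets = {"a1": (1, 3), "a2": (4, 6)}

parts = split_by_charsets(matrix, charsets)
for name, block in parts.items():
    print("[%s]" % name)
    for taxon, seq in block.items():
        print(taxon, seq)
```

Each entry of parts could then be written out as its own alignment file.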

Peter

Austin Davis-Richardson | 6 Oct 2009 23:07

[Biopython] Skipping over blank/erroneous Entrez.esummary() results

Howdy,

I'm using Biopython to generate a table of accession numbers and their
corresponding TaxIDs.  The fastest way I can do this is 20 at a time
(20 per 3 seconds rather than 1 per 3 seconds).
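
The batching described here (20 IDs per request) might be sketched in plain Python as follows. This is only an illustration: batches is a hypothetical helper, and the accession strings are made-up placeholders. Each comma-joined string is the kind of value passed as gids further down.

```python
def batches(items, size=20):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up placeholder identifiers, for illustration only.
accessions = ["ACC%03d" % i for i in range(45)]

for group in batches(accessions):
    id_string = ",".join(group)  # one comma-separated request string
    print(len(group), id_string[:20] + "...")
```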

However, this results in a problem.

Whenever my script receives a blank result from NCBI, such as
a record with no value for TaxID, Biopython crashes with the error:

  File "taxcollector3.py", line 39, in getTaxID
    record = Entrez.read(handle)
  File "/Users/audy/Downloads/biopython-1.52/build/lib.macosx-10.6-universal-2.6/Bio/Entrez/__init__.py", line 259, in read
    record = handler.run(handle)
  File "/Users/audy/Downloads/biopython-1.52/build/lib.macosx-10.6-universal-2.6/Bio/Entrez/Parser.py", line 90, in run
    self.parser.ParseFile(handle)
  File "/Users/audy/Downloads/biopython-1.52/build/lib.macosx-10.6-universal-2.6/Bio/Entrez/Parser.py", line 191, in endElement
    value = IntegerElement(value)
ValueError: invalid literal for int() with base 10: ''

My code looks like this, where gids is a string of comma-separated GIDs
(I get the GIDs from the accession numbers using
Entrez.esearch(db="nucleotide", rettype="text", term=accessions)):

			handle = Entrez.esummary(db="nucleotide", id=gids)
			record = Entrez.read(handle)

Michiel de Hoon | 7 Oct 2009 04:11

Re: [Biopython] Skipping over blank/erroneous Entrez.esummary() results

You could try the following (with biopython 1.52):

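from Bio import Entrez

# gids: comma-separated GI numbers, as in the original post
handle = Entrez.esummary(db="nucleotide", id=gids)
records = Entrez.parse(handle)
while True:
    try:
        record = next(records)
    except StopIteration:
        # No more records in the result
        break
    except ValueError:
        # The error from the traceback, e.g. an empty "integer" field
        print("Skipping record")
        continue
    # ... use record here ...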

We should probably modify Bio.Entrez so that empty "integer" values are treated correctly.

--Michiel.

--- On Tue, 10/6/09, Austin Davis-Richardson <harekrishna <at> gmail.com> wrote:

> From: Austin Davis-Richardson <harekrishna <at> gmail.com>
> Subject: [Biopython] Skipping over blank/erroneous Entrez.esummary() results
> To: biopython <at> lists.open-bio.org
> Date: Tuesday, October 6, 2009, 5:07 PM
> Howdy,
> 
> I'm using BioPython to generate a table of accession
> numbers and their
> corresponding TaxIDs.  The fastest way I can do this
> is 20 at a time
> (20 per 3 seconds rather than 1 per 3 seconds).
> 

Peter | 7 Oct 2009 11:29

Re: [Biopython] Combine nexus files but not concatenating them

On Wed, Oct 7, 2009 at 4:22 AM, Denzel Li <denzel.dz.li <at> gmail.com> wrote:
> Hi Peter:
> Thank you for the help. Both functions work well. By the way, will
> "standard" datatype or "mixed" datatype be supported in Bio.Nexus.Nexus?
>
> Best,
> Denzel

Hi Denzel,

I've CC'd the list - please try to keep replies on the list.

I'm glad Bio.Nexus is working well for you.

Regarding the finer details of the NEXUS file format and the Biopython
code, I am not an expert - we need Frank or Cymon to comment. If
you could give us a couple of examples of what you are asking for it
would probably be much clearer (to me at least).

Regards,

Peter

Peter | 7 Oct 2009 13:17

Re: [Biopython] Skipping over blank/erroneous Entrez.esummary() results

On Wed, Oct 7, 2009 at 3:11 AM, Michiel de Hoon <mjldehoon <at> yahoo.com> wrote:
>
> We should probably modify Bio.Entrez so that empty "integer" values are treated correctly.
>

Does "correctly" mean a default value? I see Brad has just committed a
change to use -1 in this case, but perhaps None is also a good choice?
Alternatively, can we leave this bit of the data structure empty?
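
To compare the options, here is a small plain-Python sketch of converting element text to an integer with a configurable fallback for blank values. parse_integer is a hypothetical helper written for this illustration; it is not the Bio.Entrez code and does not show the committed change.

```python
def parse_integer(text, default=None):
    """Convert element text to int, returning `default` when it is blank."""
    text = text.strip()
    if not text:
        return default
    return int(text)


print(parse_integer("9606"))          # normal case
print(parse_integer(""))              # blank value, None as the fallback
print(parse_integer("", default=-1))  # blank value, -1 as the fallback
```

With None the caller can tell "missing" apart from any real number, whereas -1 keeps the value an integer but relies on -1 never being a legitimate value for the field.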

Peter