João Rodrigues | 1 Dec 2007 21:02

[BioPython] GenBank and raw_input()

Hello all!

I'm trying to code a small function that uses the GenBank.search_for()
method, but I can't get it to work with raw_input(). I tried using input()
and then converting to str, and I tried creating a raw string and
concatenating it with my raw_input() string, but nothing works. I keep
getting an error in urllib2 (probably because the link isn't properly built).

Any ideas?

Thanks in advance!

João Rodrigues

_______________________________________________
BioPython mailing list  -  BioPython <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython

Peter | 3 Dec 2007 13:26

Re: [BioPython] GenBank and raw_input()

On Dec 1, 2007 8:02 PM, João Rodrigues <anaryin <at> gmail.com> wrote:
> Hello all!
>
> I'm trying to code a small function that uses the GenBank.search_for()
> method but I can't get it to work with raw_input(). I tried using input and
> then converting to str, tried to create a raw string and then concatenate
> with my raw_input string, nothing works.. I keep having an error in the
> urllib2 (probably because the link isn't properly built).
>
> Any ideas?

Can you get GenBank.search_for() to work fine with a predefined search
term?

When you are using raw_input() to get the user to type in some search
terms, have you tried stripping off any whitespace (new lines, spaces)?
That might cause problems.
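
A minimal sketch of the whitespace stripping suggested above, assuming the
search term ends up in a URL query string. It is written for Python 3, where
raw_input() is called input(); the helper names clean_search_term() and
encode_search_term() are hypothetical and are not part of Biopython.

```python
from urllib.parse import quote_plus


def clean_search_term(raw_text):
    """Strip leading/trailing whitespace and collapse internal runs of it."""
    # str.split() with no arguments splits on any whitespace (spaces, tabs,
    # "\r", "\n") and discards empty pieces, so joining gives a tidy string.
    return " ".join(raw_text.split())


def encode_search_term(raw_text):
    """Return the cleaned term, percent-encoded for a URL query string."""
    return quote_plus(clean_search_term(raw_text))


# Simulates a typed search term with stray spaces and a carriage return
# (hypothetical input; in a real script this would come from input()).
typed = "  Opuntia AND  rpl16 \r"
print(repr(clean_search_term(typed)))
print(encode_search_term(typed))
```

Printing the repr() of the cleaned term before passing it on is a cheap way
to see whether invisible characters were the cause of the broken link.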

If you could show us a short example that doesn't work it would be
easier to try and help.

Peter

Matthew Neilson | 3 Dec 2007 16:32

[BioPython] Biopython and sequence trace files...

Hi,

This question might be better suited for the development list, but here goes
anyway.  Are there any facilities in Biopython to read/write information
from sequencing trace files (e.g., .abi, .scf, .ztr, etc.)?  I know that
Bioperl has a way of utilizing the Staden io_lib, and I was hoping for the
same thing in Python.  Has anyone been able to convert io_lib into a Python
module, or could someone point me towards resources that would help me to do
this?  Thanks in advance.

-Matt

-- 
Matt Neilson
Graduate Research Assistant
Great Lakes Genetics Lab
Lake Erie Center-University of Toledo
6200 Bayshore Rd.
Oregon, OH 43618

Lab: (419) 530-8370
Fax: (419) 530-8399
matthew.neilson <at> utoledo.edu

Tiago Antao | 3 Dec 2007 22:48

[BioPython] Population genetics code example application

Hi,

For anyone interested, we have developed a selection detection application
based on the code that is currently available in the PopGen module.
You can find it here: http://popgen.eu/soft/selwb/
It is actually a Jython application.
In fact, the code developed for this application served as the base for
what is now the PopGen module (still a very small module, but coalescent
simulation and basic statistics are on the way).

If you have any problems with the application, just send me an email.
Tiago

Luca Beltrame | 4 Dec 2007 11:19

[BioPython] Adding new database types to EUtils

Hello. 
I've been trying to use EUtils to run some queries through NCBI, but
apparently GEO isn't present in the database list defined by EUtils:

In [8]: EUtils.databases
Out[8]:
{'gene': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422ecc>,
 'genome': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422e0c>,
 'journals': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422dec>,
 'nucleotide': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422e2c>,
 'omim': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422dcc>,
 'popset': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422e6c>,
 'protein': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422e4c>,
 'pubmed': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422dac>,
 'sequences': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422e8c>,
 'unigene': <Bio.EUtils.Config.DatabaseInfo instance at 0x8422eac>}

Therefore queries using the DBIdsClient method search() trying to use GEO, 
such as this one:

from Bio.EUtils import DBIdsClient
client = DBIdsClient.DBIdsClient()
test_search = client.search("GSE4830",db="geo")

will fail with a KeyError (because "geo" is not defined in that list).

How can I extend EUtils.databases to add support for GEO? I've looked a bit at
the class definitions in the API, and I'm not sure how to proceed. Any
hints would be greatly appreciated.

Peter | 4 Dec 2007 12:16

Re: [BioPython] Adding new database types to EUtils

> Hello.
> I've been trying to use EUtils to run some queries through NCBI, but
> apparently GEO isn't present in the database list defined by [Biopython's] EUtils:

I guess the first thing to do is double check that the NCBI EUtils API
will support GEO files, and then see if you can manage to fetch anything
"by hand".

It is very simple to construct a URL by hand to fetch a GEO file
directly (bypassing EUtils).
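
As an illustration of building such a URL by hand, here is a minimal sketch.
The acc.cgi address and the acc/targ/form/view parameter names are
assumptions based on GEO's accession display page and should be checked
against NCBI's current documentation; geo_record_url() is a hypothetical
helper, not a Biopython function. The code only builds the URL and does not
contact NCBI.

```python
from urllib.parse import urlencode

# Assumed address of GEO's accession display script.
GEO_ACC_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_record_url(accession, view="brief"):
    """Build a URL asking GEO for a plain-text view of one accession."""
    params = [
        ("acc", accession.strip()),  # e.g. "GSE4830"
        ("targ", "self"),            # only the record itself, no children
        ("form", "text"),            # plain text rather than HTML
        ("view", view),              # "brief" is assumed to omit data tables
    ]
    return GEO_ACC_URL + "?" + urlencode(params)


print(geo_record_url("GSE4830"))
```

The resulting string can be passed to urllib (urllib2 in Python 2) to
download the record.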

Once you have downloaded the GEO files, what do you plan to do with them?
Biopython's GEO parser is very basic...

Peter

P.S. If you use R/BioConductor, I would recommend Sean Davis' GEOquery
for this sort of thing.

Luca Beltrame | 4 Dec 2007 12:21

Re: [BioPython] Adding new database types to EUtils

On Tuesday 04 December 2007 12:16:36 Peter wrote:

> Once you have downloaded the GEO files, what do you plan to do with them?
> Biopython's GEO parser is very basic...

It was mostly to check their basic description to see whether they were
suitable for inclusion in my current work. As I have a large list of
accessions, fetching them all at once would reduce the time needed to go
through them. To be clearer: I only want to download their summaries.

> P.S. If you use R/BioConductor, I would recommend Sean Davis' GEOquery
> for this sort of thing.

I mostly use it when I need to download data set information and expression 
levels. For this simpler task, I turned to Python first as GEOquery has some 
performance issues on my machine.

I'll take a look at NCBI's EUtils and see if they support GEO. Thanks for the
tip.

-- 
Luca Beltrame, MSc. - Molecular Medicine PhD Student
Dipartimento di Scienze e Tecnologie Biomediche - UniMI
CNR - Institute of Biomedical Technologies Research Fellow
E-mail: luca.beltrame <at> unimi.it - Phone: +39-02-50320924

Sean Davis | 4 Dec 2007 14:35

Re: [BioPython] Adding new database types to EUtils

On Dec 4, 2007 6:21 AM, Luca Beltrame <luca.beltrame <at> unimi.it> wrote:

> On Tuesday 04 December 2007 12:16:36 Peter wrote:
>
> > Once you have downloaded the GEO files, what do you plan to do with them?
> > Biopython's GEO parser is very basic...
>
> It was mostly to check their basic description to see if they were feasible
> to be included in my current work. As I have a large list of accessions,
> fetching them all at once would reduce the time needed to go through them.
> To be more clear, downloading their summary.
>
> > P.S. If you use R/BioConductor, I would recommend Sean Davis' GEOquery
> > for this sort of thing.
>
> I mostly use it when I need to download data set information and expression
> levels. For this simpler task, I turned to Python first as GEOquery has some
> performance issues on my machine.
>
> I'll take a look at NCBI's EUtils and see if they support GEO. Thanks for the
> tip.

Thought I would chime in here.  GEOquery definitely does have some
performance issues, some of which I have addressed in the most recent

Luca Beltrame | 4 Dec 2007 14:49

Re: [BioPython] Adding new database types to EUtils

On Tuesday 04 December 2007 14:35:13 you wrote:

> release.  I have thought about making a python-based version, but I find R
> a much more compelling framework for statistical computing and array-based

I think it is mostly a matter of personal preference. I turned to Python (but 
I have been using GEOquery in the past) because I like the language more than 
R. 

> Metadata (and not values), then URLs can be constructed against their web

I guess I did not make this clear enough in my original mail. Yes, I
meant to fetch only the metadata, because I wanted to gather the experiment
descriptions from all the accessions I had (a rather large number) in order
to look through them without having to query for each one by hand.
I will try looking at the web queries and see if I can write something
useful (although I still think that, as basic as it is, it would be nice to
have GEO support in Bio.EUtils, at least for the metadata).

> I'm not sure that exactly the same functionality is available via Eutils,
> but I think not.

I have played a bit with EUtils, but I haven't yet been able to get esearch to
work with a GEO accession. Since I have only looked at them briefly, I can't
rule out that it was just a mistake on my part, though.
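
For reference, a sketch of what an esearch query for a list of GEO accessions
might look like. It assumes that GEO series are searchable through the
Entrez database named "gds" and that "[ACCN]" is an accepted field tag
there; both are assumptions to verify against NCBI's E-utilities
documentation. geo_esearch_url() is a hypothetical helper, and the code only
builds the URLs without sending any request.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_esearch_url(accession, db="gds"):
    """Build an esearch URL that looks up one GEO accession."""
    # urlencode() percent-encodes the square brackets of the field tag.
    params = [("db", db), ("term", accession.strip() + "[ACCN]")]
    return ESEARCH_URL + "?" + urlencode(params)


# A short, made-up list standing in for the large list of accessions.
accessions = ["GSE4830", "GSE1000 "]
for url in (geo_esearch_url(acc) for acc in accessions):
    print(url)
```

When looping over a large list, NCBI's usage guidelines on request frequency
apply, so a pause between requests would be needed in a real script.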

-- 
Luca Beltrame, MSc. - Molecular Medicine PhD Student
Dipartimento di Scienze e Tecnologie Biomediche - UniMI
CNR - Institute of Biomedical Technologies Research Fellow

Michiel de Hoon | 8 Dec 2007 04:18

Re: [BioPython] Accessing ExPASy through Bio.SwissProt /Bio.SeqIO

Peter wrote:
> I would add a note saying doing it this way gives
> Bio.SwissProt.SProt.Record objects, while you could alternatively get
> SeqRecord objects as described in the SeqIO chapter (use a reference).

OK, I will add that.

> I'd suggested a Bio.SeqIO function, with a name like parse1() or
> parse_sole() etc. which would return a single SeqRecord - and raise an
> error if the handle didn't contain one and only one record.  We could call
> this function read() if you prefer.

I'd prefer read() instead of parse1(), parse_sole() etc. for the 
following reasons:

1) Having two names that are clearly different emphasizes the fact that 
they return different things (parse() returns an iterator, read() a record).

2) Some modules deal with data that always consist of one record (for
example, gene expression data in the case of Bio.Cluster). Such modules can
have a read() function but not a parse(). It would feel strange if a
module had a parse1() function but not a parse().
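
To make the distinction concrete, here is a minimal self-contained sketch of
the two functions as discussed above: parse() returns an iterator over
records, while read() returns exactly one record and raises an error
otherwise. The toy FASTA-like format and the (name, sequence) tuples are
assumptions made for illustration; this is not the Bio.SeqIO implementation.

```python
import io


def parse(handle):
    """Yield (name, sequence) records one at a time from a FASTA-like handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def read(handle):
    """Return the only record in handle; raise ValueError for zero or many."""
    records = parse(handle)
    try:
        first = next(records)
    except StopIteration:
        raise ValueError("No records found in handle") from None
    try:
        next(records)
    except StopIteration:
        return first
    raise ValueError("More than one record found in handle")


print(read(io.StringIO(">seq1\nACGT\nTTAA\n")))
for record in parse(io.StringIO(">a\nAC\n>b\nGG\n")):
    print(record)
```

Building read() on top of parse() keeps the record-splitting logic in one
place, and read() only has to check the record count.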

--Michiel.
