bugzilla-daemon | 1 Nov 2007 05:35

[Bug 2351] Make Seq more like a string, even subclass string?

http://bugzilla.open-bio.org/show_bug.cgi?id=2351

------- Comment #8 from mdehoon <at> ims.u-tokyo.ac.jp  2007-11-01 00:35 EST -------
> > 3) The sequence should be mutable, so that we won't need a separate
> > MutableSeq class. This also implies that a Seq class cannot subclass from
> > string, since strings are not mutable.
> 
> Why? Python strings are not mutable, and this isn't usually a problem.
> Personally, I have never needed a mutable sequence and have only ever
> used them in test cases.

For my research, I do need a mutable sequence.

> Having the basic Seq non-mutable means we can leverage existing
> string functionality and optimizations.

Thinking this over, I can see one more pressing reason to keep the basic Seq
immutable: If Seq is immutable, it can be used as the key in a dictionary, and
as a member of a set. With a mutable Seq, neither is possible. So I guess we
need to keep both a Seq and a MutableSeq class. We'll have to write a clearer
explanation though in the tutorial as to why two classes are needed.
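
The hashing argument can be sketched in a few lines. The class names `ImmutableSeqSketch` and `MutableSeqSketch` below are hypothetical stand-ins, not Biopython's actual Seq and MutableSeq classes. The first subclasses str and so inherits its hash; the second wraps a list and, like Python's built-in mutable containers, is left unhashable.

```python
class ImmutableSeqSketch(str):
    """Hypothetical immutable sequence: inherits str's hash and equality."""


class MutableSeqSketch:
    """Hypothetical mutable sequence backed by a list of characters."""

    def __init__(self, data):
        self.data = list(data)

    def __setitem__(self, index, value):
        self.data[index] = value

    def __eq__(self, other):
        return isinstance(other, MutableSeqSketch) and self.data == other.data

    # Objects that compare by value and can change should not be hashable.
    __hash__ = None


# The immutable version works as a dictionary key and as a set member.
counts = {ImmutableSeqSketch("ACGT"): 1}
counts[ImmutableSeqSketch("ACGT")] += 1
unique = {ImmutableSeqSketch("ACGT"), ImmutableSeqSketch("ACGT")}

# The mutable version is rejected as a dictionary key.
try:
    {MutableSeqSketch("ACGT"): 1}
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
```

If a mutable object were hashable by value, changing it after insertion would leave it filed under a stale hash, which is why two classes are a reasonable split.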

> Also writing a new mutable sequence in C seems like a bit of a maintenance load in
> the long term (and may complicate the cross platform build process).
> Surely we can get good enough performance via the array of characters
> route currently used?

The array of characters approach allows us fast modification of sequences. On
the other hand, operations such as taking the complement are much slower than
for strings. I looked around a bit in the Python standard library and found that

bugzilla-daemon | 1 Nov 2007 09:49

[Bug 2351] Make Seq more like a string, even subclass string?

http://bugzilla.open-bio.org/show_bug.cgi?id=2351

------- Comment #9 from biopython-bugzilla <at> maubp.freeserve.co.uk  2007-11-01 04:49 EST -------
> For my research, I do need a mutable sequence.

Different work, different needs.

> > Having the basic Seq non-mutable means we can leverage existing
> > string functionality and optimizations.
> 
> Thinking this over, I can see one more pressing reason to keep the
> basic Seq immutable: If Seq is immutable, it can be used as the key
> in a dictionary, and as a member of a set. With a mutable Seq, neither
> is possible. So I guess we need to keep both a Seq and a MutableSeq class. 

Those are both good points.  The dictionary key thing is something I have used,
but hadn't thought about in my last comment.

> We'll have to write a clearer explanation though in the tutorial as to
> why two classes are needed.

Fair point.

> The array of characters approach allows us fast modification of sequences.
> On the other hand, things like taking the complement are much slower than for
> strings. I looked around a bit in the Python standard library and found that
> there already is a MutableString class (located in the UserString module).
> Since this class stores an immutable string internally, it is as fast as a
> string. So how about letting the basic Seq class inherit from string, and
> the MutableSeq class from MutableString?

bugzilla-daemon | 1 Nov 2007 10:14

[Bug 2390] Error importing Swiss Prot in BioSQL

http://bugzilla.open-bio.org/show_bug.cgi?id=2390

------- Comment #6 from biopython-bugzilla <at> maubp.freeserve.co.uk  2007-11-01 05:14 EST -------
We may have two bugs here.

First of all your original problem, TypeError: not all arguments converted
during string formatting

If you could post the SQL query and the argument list (sql and arg) it might be
helpful.  We should check that the data we are trying to insert into the
database matches the fields in the table.

Then we come to the new error, AttributeError: 'Cursor' object has no attribute
'insert_id'

I found a question on our own mailing list from Bela Tiwari, 4 November 2005
which shares this new problem and may shed some light on what is going wrong:

Begin quote
-------------------------------------------------------------------
Hello,

I am new to using biopython and biosql. I have been following the information
in the document Basic BioSQL with Biopython to try and get familiar with using
biopython to work with mysql databases and specifically I have tried to load a
Genbank file containing a small bacterial genome into a database.

I believe I have carried out all the instructions correctly (i.e. interpreted
to fit the system I am working on - Debian Sarge). The code and traceback call
that results is:

bugzilla-daemon | 1 Nov 2007 10:24

[Bug 2390] Error importing Swiss Prot in BioSQL

http://bugzilla.open-bio.org/show_bug.cgi?id=2390

------- Comment #7 from biopython-bugzilla <at> maubp.freeserve.co.uk  2007-11-01 05:24 EST -------
Created an attachment (id=801)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=801&action=view)
Untested patch to BioSQL/DBUtils.py

Based on a little google searching, this is my completely untested educated
guess for how to fix the cursor problem in BioSQL/DBUtils.py with the relevant
changed bit of code below:

class Mysql_dbutils(Generic_dbutils):
    def last_id(self, cursor, table):
        try:
            # This worked on older versions of MySQL
            return cursor.insert_id()
        except AttributeError:
            # See bug 2390.
            # Google suggests this is the new way:
            return cursor.lastrowid

If you feel brave and don't know how to work with patches, just back up the
original file and then edit the class Mysql_dbutils to look like the above.
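
For anyone wanting to try the fallback logic without a MySQL server, here is a minimal sketch that uses the standard library's sqlite3 module purely as a stand-in. The free function `last_id` and the one-column table are simplifications for illustration, not the real BioSQL code or schema. sqlite3 cursors have no insert_id() method, so the AttributeError branch is the one exercised.

```python
import sqlite3


def last_id(cursor):
    """Return the id of the last inserted row, trying the older call first."""
    try:
        # Older-style call, as in the patch above
        return cursor.insert_id()
    except AttributeError:
        # Fallback attribute; sqlite3 cursors provide it as well
        return cursor.lastrowid


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Simplified, hypothetical table; the real biodatabase table has more columns.
cur.execute(
    "CREATE TABLE biodatabase (biodatabase_id INTEGER PRIMARY KEY, name TEXT)"
)
cur.execute("INSERT INTO biodatabase (name) VALUES (?)", ("Swiss",))
new_id = last_id(cur)
conn.close()
```

This only shows that the try/except falls through cleanly when insert_id() is absent; whether lastrowid behaves as expected on a given MySQLdb version still needs testing against a real server.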

-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
bugzilla-daemon | 5 Nov 2007 17:22

[Bug 2390] Error importing Swiss Prot in BioSQL

http://bugzilla.open-bio.org/show_bug.cgi?id=2390

------- Comment #8 from Biosql <at> hotmail.com  2007-11-05 11:22 EST -------
Hi Peter, 

Here's the SQL query : 

INSERT INTO bioentry (
         biodatabase_id,
         name,
         accession,
         identifier,
         division,
         description,
         version)
        VALUES (%s,
         %s,
         %s,
         %s,
         %s,
         %s,)"""
        self.adaptor.execute(sql, (self.dbid, record.name,
                                   accession, identifier,
                                   division, description, version))

You can see there's a missing %s. 

There's also a similar error in DBUtils module at line 23, where an argument is
also missing. 
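
A mismatch like this produces the original TypeError if the database driver fills in the placeholders with Python's % operator, which is an assumption about the driver here. The sketch below uses a shortened, hypothetical query rather than the real bioentry statement, and the helper `placeholder_counts` is made up for illustration; it counts %s naively and would be fooled by a literal %s inside the SQL.

```python
def placeholder_counts(sql, args):
    """Return (number of %s placeholders, number of arguments)."""
    return sql.count("%s"), len(args)


# Shortened, hypothetical query: three columns but only two placeholders.
sql = "INSERT INTO bioentry (name, accession, version) VALUES (%s, %s)"
args = ("P12345_HUMAN", "P12345", 1)

counts = placeholder_counts(sql, args)

try:
    sql % args  # what a %-formatting driver would effectively do
    error = None
except TypeError as exc:
    error = str(exc)
```

Comparing the two counts before calling execute is a cheap way to catch this kind of slip when editing long INSERT statements by hand.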


bugzilla-daemon | 5 Nov 2007 20:49

[Bug 2390] Error importing Swiss Prot in BioSQL

http://bugzilla.open-bio.org/show_bug.cgi?id=2390

------- Comment #9 from biopython-bugzilla <at> maubp.freeserve.co.uk  2007-11-05 14:49 EST -------
Issue One
---------
Missing %s in BioSQL/Loader.py approx line 240.  I think that as you suggest,
we should add a %s here.

This was removed in CVS version 1.18 (June 2007) by Michiel while working on
bug 1982.  I think only one %s was meant to be removed (when the taxon_id was
removed), but it got done twice with the two related CVS commits.

Issue Two
---------
The cursor problem.  I don't know if my suggestion in comment 7 is working or
not - it could be something else is going wrong to result in nothing in the
database.

Issue Three
-----------
Missing table argument in BioSQL/DBUtils.py line 23, I agree with you and have
fixed this in CVS revision 1.4.

In normal use the object is subclassed and the broken last_id method would
never have been called.


bugzilla-daemon | 5 Nov 2007 22:49

[Bug 2390] Error importing Swiss Prot in BioSQL

http://bugzilla.open-bio.org/show_bug.cgi?id=2390

------- Comment #10 from Biosql <at> hotmail.com  2007-11-05 16:49 EST -------
I'm using the DBUtils.py for now and it's working fine.

Now, I've checked the uploading problem with the BioSQL database and it seems
to me that it's getting stuck right at the beginning of the import with this
SQL query:

INSERT INTO biodatabase (name, authority, description) VALUES ('Swiss', NULL,
NULL)

So, I've tried to repeat the same sql query in manual mode and I'm getting this
error :

1205, 'Lock wait timeout exceeded; try restarting transaction'

This error seems to be related to InnoDB transactions.

Since I'm very new to InnoDB (I've always used MyISAM tables), I really don't
know what to do. Well, at least I'm going to read up on InnoDB tables.

Thanks !

bugzilla-daemon | 6 Nov 2007 09:59

[Bug 2390] Error importing Swiss Prot in BioSQL

http://bugzilla.open-bio.org/show_bug.cgi?id=2390

------- Comment #11 from biopython-bugzilla <at> maubp.freeserve.co.uk  2007-11-06 03:59 EST -------
> Now, I've checked the uploading problem with the BioSQL database and it
> seems to me that it's getting stuck right at the beginning of the import
> with this SQL query:
>
> INSERT INTO biodatabase (name, authority, description)
> VALUES ('Swiss', NULL, NULL)

Do you know how to check the table schema, and see if authority and description
can be left empty/NULL?

I'm not sure what "authority" should be in this context, but it does strike me
as odd that the description is NULL - that could be a sequence file parsing
issue.  Does this happen for all files you've tried - what about a GenBank
example?

bugzilla-daemon | 7 Nov 2007 05:55

[Bug 2351] Make Seq more like a string, even subclass string?

http://bugzilla.open-bio.org/show_bug.cgi?id=2351

------- Comment #10 from mdehoon <at> ims.u-tokyo.ac.jp  2007-11-06 23:55 EST -------
> One side effect of subclassing directly is the .data property will vanish
> (the internal string/array of the Seq/MutableSeq object). 
> Some people will be using this (especially as it was actually used in some
> older versions of the tutorial).

If we add
        self.data = self
in the __init__ method of the Seq/MutableSeq classes, then the .data property
can still be used as before, without using significantly more memory.
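
A minimal sketch of this idea, using a hypothetical `SeqSketch` class rather than the real Seq, and leaving out alphabets and everything else a real Seq carries:

```python
class SeqSketch(str):
    """Hypothetical str subclass that keeps a legacy .data attribute."""

    def __init__(self, data):
        # str.__new__ has already stored the characters; .data is only an
        # alias for the object itself, so no second copy of the sequence exists.
        self.data = self


seq = SeqSketch("ACGT")
same_object = seq.data is seq
legacy_slice = seq.data[1:3]
```

One thing to be aware of with this approach: each object then holds a reference to itself, so it is reclaimed by Python's cyclic garbage collector rather than by reference counting alone.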

> I propose we make the Seq/MutableSeq object act more string like (fix
> str(my_seq) etc) for the next release and officially declare the .data
> deprecated in the documentation.  This should be backwards compatible - except
> where anyone used the str(my_seq) to get a truncated string deliberately.  
> Then shift to actual subclasses for a later release.

As we can keep the seq.data property even after subclassing, how about
subclassing right away for the next release?

Tiago Antao | 7 Nov 2007 12:09

resuming CVS updates

Hi,

Is it OK to resume with CVS updates of non-production code? (No rush or 
urgency, just to know...)

Tiago
-- 
tiagoantao <at> gmail.com
http://tiago.org/ps
