Paul J Stevens | 1 Nov 08:42 2004

Re: performance of speedup patches

Aaron Stone wrote:
> Paul J Stevens <paul@...> said:
> 
> [huge snip]
> 
>>Same goes for the dbmysql and my own db_getmailbox patches although that
>>last one requires an additional index on dbmail_messages to be effective.
> 
> 
> Does it degrade performance without the index? If not, I think we can put
> it in a gray area, and say that we're robust with or without the index,
> but it's a good idea to add it.

That's it. It'll work just fine without the index, but adding the index is recommended for performance's sake.

-- 
   ________________________________________________________________
   Paul Stevens                                  mailto:paul@...
   NET FACILITIES GROUP                     PGP: finger paul@...
   The Netherlands________________________________http://www.nfg.nl
Hans Kristian Rosbach | 1 Nov 10:52 2004

Re: Some comments

On Fri, 2004-10-29 at 18:40, Sean Chittenden wrote:
> >> I'm going to brute-force by trying combinations until
> >> I get the wanted results now, but I do not really feel
> >> well with doing this as it might be wrong in corner-cases.
> >
> > Well, I think I have mostly figured out the basics of the
> > new schema now. So I made a simple drop-in replacement select
> > statement that just pulls out message-id, size, seen_flag and
> > timestamp. Problem is, the new layout results in a join of
> > 4 tables instead of 1. And the result is that it takes about
> > 5 times as long to execute.
> 
> If you PREPARE this statement in PostgreSQL and then EXECUTE it at a 
> later date, does it still take 5 times as long, or is this a MySQL only 
> penalization?  PostgreSQL and its optimizer/planner do really well with 
> joins.  Back in the day, I remember MySQL fell on its face after 2 
> joins so it wouldn't surprise me if MySQL would be slower with this, 
> but would it be for PostgreSQL?  -sc

Well, I guess I forgot to mention it..  I only run PgSQL, never
ever used MySQL. So I guess yes, it's slower..  =(

-=Dead2=-
Hans Kristian Rosbach | 1 Nov 11:00 2004

Re: Some comments

On Fri, 2004-10-29 at 19:01, Micah wrote:
> Just curious, but I'm wondering why you're joining to the messageblks
> table as it doesn't appear to have any bearing on the query?
> 
> Why not:
> 
> SELECT DISTINCT physmessage.id, physmessage.messagesize,
>  messages.seen_flag, physmessage.internal_date
> 
>  FROM dbmail_mailboxes mailboxes, dbmail_messages messages,
>  dbmail_physmessage physmessage
> 
>  WHERE mailboxes.owner_idnr='4' AND messages.status<'2'
> 
>  AND mailboxes.mailbox_idnr=messages.mailbox_idnr AND
>  messages.physmessage_id=physmessage.id
> 
>  ORDER BY physmessage.id DESC

Thanks, I did not see that. I was thinking I might be getting blind
on my own code.

It did cause a speedup, but not too big..  ~5-10% it seems.
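
A minimal sketch of why the extra join needed DISTINCT, using SQLite through
Python as a stand-in for PostgreSQL. The table layouts are trimmed assumptions
that cover only the columns the query touches, and the sample rows are made up.
Joining dbmail_messageblks yields one row per block, which DISTINCT then has to
collapse; without that join the same row comes back once.

```python
import sqlite3

# SQLite stands in for PostgreSQL; columns are an assumed subset of the 2.0 schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbmail_mailboxes   (mailbox_idnr INTEGER PRIMARY KEY, owner_idnr INTEGER);
CREATE TABLE dbmail_physmessage (id INTEGER PRIMARY KEY, messagesize INTEGER,
                                 internal_date TEXT);
CREATE TABLE dbmail_messages    (message_idnr INTEGER PRIMARY KEY, mailbox_idnr INTEGER,
                                 physmessage_id INTEGER, seen_flag INTEGER, status INTEGER);
CREATE TABLE dbmail_messageblks (messageblk_idnr INTEGER PRIMARY KEY,
                                 physmessage_id INTEGER, messageblk TEXT);

-- One message for owner 4, stored as three blocks (made-up sample data).
INSERT INTO dbmail_mailboxes   VALUES (1, 4);
INSERT INTO dbmail_physmessage VALUES (10, 2048, '2004-11-01 10:52:00');
INSERT INTO dbmail_messages    VALUES (100, 1, 10, 0, 0);
INSERT INTO dbmail_messageblks VALUES (1, 10, 'headers'), (2, 10, 'part 1'), (3, 10, 'part 2');
""")

# Four-table form, without DISTINCT so the row multiplication is visible.
with_blks = """
SELECT p.id, p.messagesize, m.seen_flag, p.internal_date
  FROM dbmail_mailboxes b, dbmail_messages m, dbmail_physmessage p, dbmail_messageblks k
 WHERE b.owner_idnr = 4 AND m.status < 2
   AND b.mailbox_idnr = m.mailbox_idnr
   AND m.physmessage_id = p.id
   AND k.physmessage_id = p.id
 ORDER BY p.id DESC
"""

# Three-table form, as suggested in the quoted query.
without_blks = """
SELECT p.id, p.messagesize, m.seen_flag, p.internal_date
  FROM dbmail_mailboxes b, dbmail_messages m, dbmail_physmessage p
 WHERE b.owner_idnr = 4 AND m.status < 2
   AND b.mailbox_idnr = m.mailbox_idnr
   AND m.physmessage_id = p.id
 ORDER BY p.id DESC
"""

rows_with = conn.execute(with_blks).fetchall()
rows_without = conn.execute(without_blks).fetchall()
print(len(rows_with), len(rows_without))
```

The two forms only agree when every physmessage has at least one block; a
physmessage without blocks would be dropped by the four-table join.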

> Not being familiar with the 2.0 schema yet, this might be a dumb comment,
> but as it's just a normal join, I don't think you're trying to verify that
> physmessage has a mate in messageblks.

I too am not familiar with the 2.0 schema yet.. And I do think that
some of the stuff that has changed since 1.0 was totally useless.

bugtrack | 1 Nov 11:05 2004

[DBMail 0000109]: update mailbox2dbmail to version 2.0


The following bug has been RESOLVED.
======================================================================
http://dbmail.org/mantis/bug_view_advanced_page.php?bug_id=0000109
======================================================================
Reported By:                danweber
Assigned To:                ilja
======================================================================
Project:                    DBMail
Bug ID:                     109
Category:                   General
Reproducibility:            always
Severity:                   feature
Priority:                   high
Status:                     resolved
Resolution:                 fixed
======================================================================
Date Submitted:             27-Oct-04 01:17 CEST
Last Modified:              01-Nov-04 11:05 CET
======================================================================
Summary:                    update mailbox2dbmail to version 2.0
Description: 
The version of mailbox2dbmail in CVS has a recursion limit problem.  Please
merge this with CVS.  Thanks

Dan
http://mirrorlynx.com/~dan/mailbox2dbmail-2.0.tar.bz2

======================================================================


Ilja Booij | 1 Nov 11:45 2004

Re: Two bottlenecks in db_getmailbox

On Mon, 25 Oct 2004 22:17:36 +0200, Paul J Stevens <paul@...> wrote:
> 
> Ok Jesse, thanks for the pointer.
> 
> Looks like db_imap_append uses its own version of the insertion logic and still inserts messages with empty
> unique_ids during the insertion sequence. That should be fixed, if only by wrapping this in a transaction.
> Come to think of it, wouldn't transactions actually do?

Yep, transactions would do. I don't know if it was Aaron or me who
rewrote all the other message insertion code to not add an empty
unique id anymore. There is no problem in that code because the email
is first delivered to the INTERNAL_DELIVERY_USER. Later a single
atomic INSERT is done in the messages table. This INSERT copies the
message information from the message inserted into the
INTERNAL_DELIVERY_USER's mailbox to the recipient user's mailbox.

db_imap_append_msg does *not* use this logic. It adds an entry into
dbmail_messages, then adds the messageblocks, and finally finalizes
the entry in dbmail_messages by adding a unique ID.

I'll fix the function so the messageblocks are filled first, after
which the message is added to the messages table, like this:
1. create physmessage
2. add messageblocks
3. add message
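
The three steps above can be sketched roughly as follows. This is not the
actual db_imap_append_msg code: SQLite through Python stands in for the real
backend, the table layouts are trimmed assumptions, append_message is a
hypothetical helper, and a random hex string stands in for however DBMail
really generates unique ids. The point is only the ordering and the single
transaction around it.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbmail_physmessage (id INTEGER PRIMARY KEY,
                                 messagesize INTEGER NOT NULL DEFAULT 0);
CREATE TABLE dbmail_messageblks (messageblk_idnr INTEGER PRIMARY KEY,
                                 physmessage_id INTEGER NOT NULL
                                     REFERENCES dbmail_physmessage(id),
                                 messageblk TEXT NOT NULL);
-- The CHECK is an illustrative assumption: a message row never has an empty unique_id.
CREATE TABLE dbmail_messages    (message_idnr INTEGER PRIMARY KEY,
                                 mailbox_idnr INTEGER NOT NULL,
                                 physmessage_id INTEGER NOT NULL
                                     REFERENCES dbmail_physmessage(id),
                                 unique_id TEXT NOT NULL CHECK (unique_id <> ''));
""")

def append_message(conn, mailbox_idnr, blocks):
    """Hypothetical helper: physmessage first, then blocks, then the message row."""
    with conn:  # commits on success, rolls back if any statement raises
        # 1. create physmessage
        size = sum(len(b) for b in blocks)
        phys_id = conn.execute(
            "INSERT INTO dbmail_physmessage (messagesize) VALUES (?)", (size,)
        ).lastrowid
        # 2. add messageblocks
        conn.executemany(
            "INSERT INTO dbmail_messageblks (physmessage_id, messageblk) VALUES (?, ?)",
            [(phys_id, b) for b in blocks],
        )
        # 3. add message, with its final unique id set in the same INSERT
        unique_id = uuid.uuid4().hex  # stand-in for the real id scheme
        conn.execute(
            "INSERT INTO dbmail_messages (mailbox_idnr, physmessage_id, unique_id) "
            "VALUES (?, ?, ?)",
            (mailbox_idnr, phys_id, unique_id),
        )
    return unique_id

uid = append_message(conn, 1, ["Subject: hi\r\n\r\n", "hello"])
stored = conn.execute("SELECT unique_id FROM dbmail_messages").fetchall()
```

Because the message row is written last, no other session can see a row in
dbmail_messages that still lacks its unique id, and a failure in any step
leaves nothing behind.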

The message will then get the right unique id directly, without having
to change the unique_id later on. In fact, I don't think the unique_id
of a message should ever be changed after having been set initially.
Functions like db_message_set_unique_id() should therefore be removed,

Hans Kristian Rosbach | 1 Nov 11:47 2004

Suggested schema changes (Was: Some comments)

> I too am not familiar with the 2.0 schema yet.. And I do think that
> some of the stuff that has changed since 1.0 was totally useless.
> I'm going to get digging to see whether this is true or not.
> 
> I might also propose a whole new schema, but I do not expect it to
> be used by others than me.

Ok, I've looked it over again..

I see no use for the dbmail_physmessage table. It can as far as I can
see be merged into dbmail_messages with only very minor fixes.

Advantages:
-Simplified schema
-Decreased storage size for database
-Increased speed

Disadvantages:
-none?

I also found an index that was not needed due to duplicates in a
multi-column index. (dbmail_messageblks_physmessage_idx)

I added indexing on deleted_flag and internal_date. This is useful
for webmail at least.
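
A rough sketch of both index points, again with SQLite through Python standing
in for PostgreSQL. The table layouts, the columns of the multi-column index
(physmessage_id, is_header) and all index names here are assumptions for
illustration. It shows that a lookup on physmessage_id alone can be served by
a multi-column index whose leading column is physmessage_id, which is why a
separate single-column index on it adds nothing, and it creates the two extra
indexes against the unmerged 2.0 layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbmail_physmessage (id INTEGER PRIMARY KEY, internal_date TEXT);
CREATE TABLE dbmail_messages    (message_idnr INTEGER PRIMARY KEY, physmessage_id INTEGER,
                                 deleted_flag INTEGER NOT NULL DEFAULT 0);
CREATE TABLE dbmail_messageblks (messageblk_idnr INTEGER PRIMARY KEY, physmessage_id INTEGER,
                                 is_header INTEGER NOT NULL DEFAULT 0, messageblk TEXT);

-- Assumed multi-column index; physmessage_id is its leading column.
CREATE INDEX blks_physmessage_is_header_idx
    ON dbmail_messageblks (physmessage_id, is_header);

-- Hypothetical names for the two added indexes.
CREATE INDEX messages_deleted_flag_idx     ON dbmail_messages (deleted_flag);
CREATE INDEX physmessage_internal_date_idx ON dbmail_physmessage (internal_date);
""")

# Ask the planner how it would look up blocks by physmessage_id alone.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT messageblk FROM dbmail_messageblks WHERE physmessage_id = ?",
    (1,),
).fetchall()
plan_detail = " ".join(row[3] for row in plan)  # 4th column holds the plan text
print(plan_detail)

index_names = sorted(
    name for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
```

Whether an index on a low-cardinality column such as deleted_flag pays off
depends on the data and the planner, so it is worth checking with EXPLAIN on a
real mailbox before keeping it.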

And why do we have answered_flag, deleted_flag and so on in
dbmail_mailboxes?

I also think we lack some more indexes, but I have to do some

Paul J Stevens | 1 Nov 12:31 2004

Re: Suggested schema changes (Was: Some comments)


Hans Kristian Rosbach wrote:
>>I too am not familiar with the 2.0 schema yet.. And I do think that
>>some of the stuff that has changed since 1.0 was totally useless.
>>I'm going to get digging to see whether this is true or not.
>>
>>I might also propose a whole new schema, but I do not expect it to
>>be used by others than me.
> 
> 
> Ok, I've looked it over again..
> 
> I see no use for the dbmail_physmessage table. It can as far as I can
> see be merged into dbmail_messages with only very minor fixes.

Which was the situation pre-2.0

The physmessage table was added to the 2.0 setup for good reasons. IIRC, the 
main one was making imap copy and move commands *much* cheaper.  But I'm sure 
Ilja can explain the reasoning a bit better.

In fact, I dug up Roel's original design considerations:

http://mailman.fastxs.net/pipermail/dbmail-dev/2003-July/002528.html

-- 
   ________________________________________________________________
   Paul Stevens                                         paul@...
   NET FACILITIES GROUP                     GPG/PGP: 1024D/11F8CD31
   The Netherlands_______________________________________www.nfg.nl

Hans Kristian Rosbach | 1 Nov 13:02 2004

Re: Suggested schema changes (Was: Some comments)

On Mon, 2004-11-01 at 12:31, Paul J Stevens wrote:
> Hans Kristian Rosbach wrote:
> >>I too am not familiar with the 2.0 schema yet.. And I do think that
> >>some of the stuff that has changed since 1.0 was totally useless.
> >>I'm going to get digging to see whether this is true or not.
> >>
> >>I might also propose a whole new schema, but I do not expect it to
> >>be used by others than me.
> > 
> > 
> > Ok, I've looked it over again..
> > 
> > I see no use for the dbmail_physmessage table. It can as far as I can
> > see be merged into dbmail_messages with only very minor fixes.
> 
> Which was the situation pre-2.0
> 
> The physmessage table was added to the 2.0 setup for good reasons. IIRC, the 
> main one was making imap copy and move commands *much* cheaper.  But I'm sure 
> Ilja can explain the reasoning a bit better.
> 
> In fact, I dug up Roel's original design considerations:
> 
> http://mailman.fastxs.net/pipermail/dbmail-dev/2003-July/002528.html

Thanks for the explanation..

So, in order to get one message we need to look it up in this order:
Users->Mailboxes->Physmessages->mailblks


Magnus Sundberg | 1 Nov 13:21 2004

Re: Suggested schema changes (Was: Some comments)

Hans Kristian Rosbach wrote:

> 
> Well, this sucks.. But I guess it's nice for IMAP users. But do
> people actually copy messages that much? I've never done so myself,
> and I don't really see any big use for it. In my opinion it is not
> worth it to make everything else slow and complex in order to speed
> up a seldom used function. The move argument is not true I think,
> couldn't that be just as easily done using a simple update?
> 
> Still I see no use for answered_flag etc in Mailboxes?
> And the index changes should also be valid.
> 
There is no IMAP move command. IMAP clients move messages by
copying them from the original to the destination mailbox and then
deleting the original message.
This can also be useful when the same message is distributed to
multiple recipients; then there is a theoretical possibility of
storing just one message in the mail store.
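
A minimal sketch of why a copy gets cheap once the message body hangs off a
separate physmessage row. SQLite through Python stands in for the real
backend, the table layouts are trimmed assumptions, the sample rows are made
up, and move_message is a hypothetical helper, not a DBMail function. The copy
is a single new row in dbmail_messages that points at the same physmessage;
the blocks are never duplicated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dbmail_physmessage (id INTEGER PRIMARY KEY, messagesize INTEGER);
CREATE TABLE dbmail_messageblks (messageblk_idnr INTEGER PRIMARY KEY,
                                 physmessage_id INTEGER REFERENCES dbmail_physmessage(id),
                                 messageblk TEXT);
CREATE TABLE dbmail_messages    (message_idnr INTEGER PRIMARY KEY, mailbox_idnr INTEGER,
                                 physmessage_id INTEGER REFERENCES dbmail_physmessage(id),
                                 unique_id TEXT,
                                 deleted_flag INTEGER NOT NULL DEFAULT 0);

-- One message in mailbox 1, stored as two blocks (made-up sample data).
INSERT INTO dbmail_physmessage VALUES (10, 11);
INSERT INTO dbmail_messageblks VALUES (1, 10, 'headers'), (2, 10, 'body');
INSERT INTO dbmail_messages    VALUES (100, 1, 10, 'uid-a', 0);
""")

def move_message(conn, message_idnr, dest_mailbox, new_unique_id):
    """Hypothetical helper: move the way clients do it, as copy plus delete flag."""
    with conn:  # both statements commit together or not at all
        # Copy: one new dbmail_messages row that reuses the physmessage_id.
        conn.execute(
            "INSERT INTO dbmail_messages (mailbox_idnr, physmessage_id, unique_id) "
            "SELECT ?, physmessage_id, ? FROM dbmail_messages WHERE message_idnr = ?",
            (dest_mailbox, new_unique_id, message_idnr),
        )
        # Delete: flag the original; the shared blocks stay in place.
        conn.execute(
            "UPDATE dbmail_messages SET deleted_flag = 1 WHERE message_idnr = ?",
            (message_idnr,),
        )

move_message(conn, 100, 2, "uid-b")
message_rows = conn.execute(
    "SELECT mailbox_idnr, physmessage_id, deleted_flag "
    "FROM dbmail_messages ORDER BY message_idnr"
).fetchall()
block_count = conn.execute("SELECT COUNT(*) FROM dbmail_messageblks").fetchone()[0]
```

Delivering one message to several recipients would follow the same pattern:
one physmessage with its blocks, and one dbmail_messages row per mailbox.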

Magnus
Hans Kristian Rosbach | 1 Nov 13:27 2004

Re: Suggested schema changes (Was: Some comments)

> So, in order to get one message we need to look it up in this order:
> Users->Mailboxes->Physmessages->mailblks
> 
> Well, this sucks.. But I guess it's nice for IMAP users. But do
> people actually copy messages that much? I've never done so myself,
> and I don't really see any big use for it. In my opinion it is not
> worth it to make everything else slow and complex in order to speed
> up a seldom used function. The move argument is not true I think,
> couldn't that be just as easily done using a simple update?

Well, I've taken a good look at it again..  And I cannot really see
that this is true..

Physmessages is not a layer between Mailboxes and Mailblks.

Physmessages only contains one id (its own), and both
Mailboxes and Mailblks point to physmessages using that id.
So I still see no gain from using the Physmessages table.

CREATE SEQUENCE dbmail_physmessage_id_seq;
CREATE TABLE dbmail_physmessage (
   id INT8 DEFAULT nextval('dbmail_physmessage_id_seq'),
   messagesize INT8 DEFAULT '0' NOT NULL,
   rfcsize INT8 DEFAULT '0' NOT NULL,
   internal_date TIMESTAMP WITHOUT TIME ZONE,
   PRIMARY KEY(id)
);

See, it would need another id in there to be able to point
one mailblks into several mailboxes.

