Alvaro Herrera | 1 May 05:08 2010

Re: autovacuum strategy / parameters

Josh Berkus wrote:

> #autovacuum_vacuum_scale_factor = 0.2
> 
> This is set because in my experience, 20% bloat is about the level at
> which bloat starts affecting performance; thus, we want to vacuum at
> that level but not sooner.  This does mean that very large tables which
> never have more than 10% updates/deletes don't get vacuumed at all until
> freeze_age; this is a *good thing*. VACUUM on large tables is expensive;
> you don't *want* to vacuum a billion-row table which has only 100,000
> updates.

Hmm, now that we have partial vacuum, perhaps we should revisit this.

> It would be worth doing a DBT2/DBT5 test run with different autovac
> settings post-8.4 to see if we should specifically change the vacuum
> threshold.

Right.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Robert Haas | 1 May 13:39 2010

Re: autovacuum strategy / parameters

On Fri, Apr 30, 2010 at 6:50 PM, Josh Berkus <josh <at> agliodbs.com> wrote:
> Which is the opposite of my experience; currently we have several
> clients who have issues which required more-frequent analyzes on
> specific tables.

That's all fine, but probably not too relevant to the original
complaint - the OP backed off the default settings by several orders
of magnitude, which might very well cause a problem with both VACUUM
and ANALYZE.

I don't have a stake in the ground on what the right settings are, but
I think it's fair to say that if you vacuum OR analyze much less
frequently than what we recommend by default, it might break.

...Robert

Scott Marlowe | 1 May 18:08 2010

Re: autovacuum strategy / parameters

On Wed, Apr 28, 2010 at 8:20 AM, Thomas Kellerer <spam_eater <at> gmx.net> wrote:
> Rick, 22.04.2010 22:42:
>>
>> So, in a large table, the scale_factor is the dominant term. In a
>> small table, the threshold is the dominant term. But both are taken into
>> account.
>>
>> The default values are set for small tables; it is not being run for
>> large tables.
>
> With 8.4 you can adjust the autovacuum settings per table...

You can in 8.3 as well, but there it's done through entries in the
pg_autovacuum table rather than with ALTER TABLE.
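
A minimal sketch of the rule Rick describes above, assuming the formula
from the PostgreSQL documentation (vacuum threshold =
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of
tuples) and the stock defaults of 50 and 0.2. The function name and the
sample table sizes are made up for illustration.

```python
def vacuum_trigger(reltuples, base_threshold=50, scale_factor=0.2):
    """Dead-tuple count a table must exceed before autovacuum considers it.

    Assumes the documented formula; 50 and 0.2 are the default values of
    autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
    """
    return base_threshold + scale_factor * reltuples


# Illustrative table sizes only: small, medium, and a billion rows.
for rows in (100, 10_000, 1_000_000_000):
    trigger = vacuum_trigger(rows)
    print(f"{rows:>13,} rows -> vacuum after ~{trigger:,.0f} dead tuples")
```

With these defaults the fixed threshold is most of the trigger point for
the 100-row table, while for the billion-row table practically all of it
comes from the scale factor.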

Scott Marlowe | 1 May 18:13 2010

Re: autovacuum strategy / parameters

On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh <at> agliodbs.com> wrote:
> Which is the opposite of my experience; currently we have several
> clients who have issues which required more-frequent analyzes on
> specific tables.   Before 8.4, vacuuming more frequently, especially on
> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
> with 8.4 it's not something you want to increase without thinking about
> the tradeoff

Actually, I would think that statement should be that before 8.3
vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
had to do with moving FSM to disk, making seldom vacuumed tables
easier to keep track of, and making autovac work better in the
presence of long running transactions.  The ability to tune IO load
etc was basically unchanged in 8.4.

Greg Smith | 1 May 19:11 2010

Re: autovacuum strategy / parameters

Robert Haas wrote:
> I don't have a stake in the ground on what the right settings are, but
> I think it's fair to say that if you vacuum OR analyze much less
> frequently than what we recommend by default, it might break.

I think the default settings are essentially minimum recommended 
frequencies.  They aren't too terrible for the giant data warehouse case 
Josh was suggesting they came from--waiting until there's 20% worth of 
dead stuff before kicking off an intensive vacuum is OK when vacuum is 
expensive and you're mostly running big queries anyway.  And for smaller 
tables, the threshold helps it kick in a little earlier.  It's unlikely 
anyone wants to *increase* those, so that autovacuum runs even less; out 
of the box it's not tuned to run very often at all.

If anything, I'd expect people to want to increase how often it runs, 
for tables where much less than 20% dead is a problem.  The most common 
situation I've seen where that's the case is when you have a hotspot of 
heavily updated rows in a large table, and this may match some of the 
situations that Robert was alluding to seeing.  Let's say you have a big 
table where 0.5% of the users each update their respective records 
heavily, averaging 30 times each.  That's only going to result in 15% 
dead rows, so no autovacuum.  But latency for those users will suffer 
greatly, because they might have to do lots of seeking around to get 
their little slice of the data.
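
To put rough numbers on that hotspot case, here is a sketch assuming each
update leaves exactly one dead row version, that nothing is reclaimed in
the meantime (no HOT pruning, no vacuum), and the default threshold of 50
and scale factor of 0.2. The 10-million-row table size and the function
names are hypothetical.

```python
def dead_tuples(reltuples, hot_fraction, updates_per_row):
    # Assumption: every update leaves one dead row version behind.
    return reltuples * hot_fraction * updates_per_row


def autovacuum_would_fire(reltuples, dead, base_threshold=50, scale_factor=0.2):
    # Documented trigger: dead tuples must exceed
    # autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples.
    return dead > base_threshold + scale_factor * reltuples


rows = 10_000_000  # hypothetical table size
dead = dead_tuples(rows, hot_fraction=0.005, updates_per_row=30)

print(f"dead tuples: {dead:,.0f} ({dead / rows:.0%} of the table)")
print("autovacuum fires:", autovacuum_would_fire(rows, dead))
```

The dead rows stay below the trigger point for the table as a whole even
though they are all concentrated in the small slice those users keep
reading.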

--

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg <at> 2ndQuadrant.com   www.2ndQuadrant.us

Tom Lane | 1 May 19:25 2010

Re: autovacuum strategy / parameters

Greg Smith <greg <at> 2ndquadrant.com> writes:
> If anything, I'd expect people to want to increase how often it runs, 
> for tables where much less than 20% dead is a problem.  The most common 
> situation I've seen where that's the case is when you have a hotspot of 
> heavily updated rows in a large table, and this may match some of the 
> situations that Robert was alluding to seeing.  Let's say you have a big 
> table where 0.5% of the users each update their respective records 
> heavily, averaging 30 times each.  That's only going to result in 15% 
> dead rows, so no autovacuum.  But latency for those users will suffer 
> greatly, because they might have to do lots of seeking around to get 
> their little slice of the data.

With a little luck, HOT will alleviate that case, since HOT updates can
be reclaimed without running vacuum per se.  I agree there's a risk
there though.

Now that partial vacuum is available, it'd be a real good thing to
revisit these numbers.

			regards, tom lane

Robert Haas | 1 May 21:08 2010

Re: autovacuum strategy / parameters

On Sat, May 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe <at> gmail.com> wrote:
> On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh <at> agliodbs.com> wrote:
>> Which is the opposite of my experience; currently we have several
>> clients who have issues which required more-frequent analyzes on
>> specific tables.   Before 8.4, vacuuming more frequently, especially on
>> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
>> with 8.4 it's not something you want to increase without thinking about
>> the tradeoff
>
> Actually I would think that statement would be that before 8.3
> vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
> had to do with moving FSM to disk, making seldom vacuumed tables
> easier to keep track of, and making autovac work better in the
> presence of long running transactions.  The ability to tune IO load
> etc was basically unchanged in 8.4.

What about http://www.postgresql.org/docs/8.4/static/storage-vm.html ?

...Robert

Scott Marlowe | 1 May 21:17 2010

Re: autovacuum strategy / parameters

On Sat, May 1, 2010 at 1:08 PM, Robert Haas <robertmhaas <at> gmail.com> wrote:
> On Sat, May 1, 2010 at 12:13 PM, Scott Marlowe <scott.marlowe <at> gmail.com> wrote:
>> On Fri, Apr 30, 2010 at 4:50 PM, Josh Berkus <josh <at> agliodbs.com> wrote:
>>> Which is the opposite of my experience; currently we have several
>>> clients who have issues which required more-frequent analyzes on
>>> specific tables.   Before 8.4, vacuuming more frequently, especially on
>>> large tables, was very costly; vacuum takes a lot of I/O and CPU.  Even
>>> with 8.4 it's not something you want to increase without thinking about
>>> the tradeoff
>>
>> Actually I would think that statement would be that before 8.3
>> vacuum was much more expensive.  The changes to vacuum for 8.4 mostly
>> had to do with moving FSM to disk, making seldom vacuumed tables
>> easier to keep track of, and making autovac work better in the
>> presence of long running transactions.  The ability to tune IO load
>> etc was basically unchanged in 8.4.
>
> What about http://www.postgresql.org/docs/8.4/static/storage-vm.html ?

That really only has an effect on tables that aren't updated very
often.  Unless you've got a whole bunch of those, it's not that big of
a deal.


