Mark Kirkwood | 1 Mar 2004 08:30
Picon
Favicon

Re: FreeBSD config


Shridhar Daithankar wrote:

> Dror Matalon wrote:
>
>> I've read Matt Dillon's discussion about the freebsd VM at
>> http://www.daemonnews.org/200001/freebsd_vm.html and I didn't see him
>> saying that Freebsd uses all the free RAM for disk cache. Would you care
>> to provide a URL pointing to that?
>
>
>
Quoting from http://www.daemonnews.org/200001/freebsd_vm.html :

<snip>*
When To Free a Page*

Since the VM system uses all available memory for disk caching, there 
are usually very few truly-free pages...
</snip>

Got to say - a very interesting discussion you have all being having, I 
am now quite confused about what those vfs.*buf* variables actually do...

Please feed back any clarfications from the FreeBSD experts to this list!

regards

Mark

(Continue reading)

Dror Matalon | 1 Mar 2004 08:39

Re: FreeBSD config

On Mon, Mar 01, 2004 at 08:30:58PM +1300, Mark Kirkwood wrote:
> 
> 
> Shridhar Daithankar wrote:
> 
> >Dror Matalon wrote:
> >
> >>I've read Matt Dillon's discussion about the freebsd VM at
> >>http://www.daemonnews.org/200001/freebsd_vm.html and I didn't see him
> >>saying that Freebsd uses all the free RAM for disk cache. Would you care
> >>to provide a URL pointing to that?
> >
> >
> >

I noticed this passage too, but ...
> Quoting from http://www.daemonnews.org/200001/freebsd_vm.html :
> 
> <snip>*
> When To Free a Page*
> 
> Since the VM system uses all available memory for disk caching, there 
        ^^^^^^^^^^^^^

The VM system, as you can see from the article, is focused on paging and
caching the programs and program data. Is the cache for disk reads and
writes thrown into the mix as well?

> are usually very few truly-free pages...
> </snip>
(Continue reading)

Mark Kirkwood | 1 Mar 2004 09:19
Picon
Favicon

Re: FreeBSD config


>I noticed this passage too, but ...
>  
>
>>Quoting from http://www.daemonnews.org/200001/freebsd_vm.html :
>>
>><snip>*
>>When To Free a Page*
>>
>>Since the VM system uses all available memory for disk caching, there 
>>    
>>
>        ^^^^^^^^^^^^^
>
>The VM system, as you can see from the article, is focused on paging and
>caching the programs and program data. Is the cache for disk reads and
>writes thrown into the mix as well?
>
>  
>
Yes - that is the real question. The following link :

http://www.freebsd.org/doc/en_US.ISO8859-1/books/arch-handbook/vm-cache.html

and the few "next" pages afterward talk about a unified buffer cache i.e 
file buffer cache is part of the KVM system which is part of the VM 
system - but this does not seem to preclude those vfs.*buf* variables 
limiting the size of the file buffer cache... hmm ... so no real 
decrease of confusion at this end.. :-)

(Continue reading)

Andrew Sullivan | 1 Mar 2004 15:34
Picon

Re: compiling 7.4.1 on Solaris 9

On Thu, Feb 26, 2004 at 12:46:23PM +0000, teknokrat wrote:
> I've read about the place. Would using -O3 be an improvement?

In my experience, it's not only not an improvement, it sometimes
breaks the code.  That's on 8, though, not 9.

A

--

-- 
Andrew Sullivan  | ajs <at> crankycanuck.ca
The plural of anecdote is not data.
		--Roger Brinner

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

Anjan Dave | 1 Mar 2004 16:35
Favicon

Scaling further up

All:
 
We have a Quad-Intel XEON 2.0GHz (1MB cache), 12GB memory, running RH9, PG 7.4.0. There's an internal U320, 10K RPM RAID-10 setup on 4 drives.
 
We are expecting a pretty high load, a few thousands of 'concurrent' users executing either select, insert, update, statments.
 
What is the next step up in terms of  handling very heavy loads? Clustering?
 
Are there any standard, recommended clustering options?
 
 
Also, in terms of hardware, overall, what benefits more, a SunFire 880 (6 or 8 CPUs, lots of RAM, internal FC Drives) type of machine, or an IA-64 architecture?
 
Appreciate any inputs,
 
Thanks,
Anjan

**************************************************************************

This e-mail and any files transmitted with it are intended for the use of the addressee(s) only and may be confidential and covered by the attorney/client and other privileges. If you received this e-mail in error, please notify the sender; do not disclose, copy, distribute, or take any action in reliance on the contents of this information; and delete it from your system. Any other use of this e-mail is prohibited.

 
Rob Fielding | 1 Mar 2004 18:30

Re: WAL Optimisation - configuration and usage


Further update to my WAL experimentation. pg_xlog files have increased 
to 81, and checking today up to 84. Currently nothing much going on with 
the server save a background process running a select every 30 seconds 
with almost no impact (according to IO from vmstats).

This in itself is a good sign - an improvement on running last week, but 
I'd still like to get clarification on WAL file usage if possible.

Log file tailing has nothing more interesting than a whole set of 
"recycled transaction log file" entries :

2004-03-01 16:01:55 DEBUG:  recycled transaction log file 0000007100000017
2004-03-01 16:07:01 DEBUG:  recycled transaction log file 0000007100000018
2004-03-01 16:17:14 DEBUG:  recycled transaction log file 0000007100000019
2004-03-01 16:22:20 DEBUG:  recycled transaction log file 000000710000001A
2004-03-01 16:32:31 DEBUG:  recycled transaction log file 000000710000001B
2004-03-01 16:37:36 DEBUG:  recycled transaction log file 000000710000001C
2004-03-01 16:47:48 DEBUG:  recycled transaction log file 000000710000001D
2004-03-01 16:52:54 DEBUG:  recycled transaction log file 000000710000001E
2004-03-01 17:03:05 DEBUG:  recycled transaction log file 000000710000001F

Looks kinda automated, but the times aren't quite even at around 6-10 
minutes apart.

cheers,
--

-- 

Rob Fielding
rob <at> dsvr.net

www.dsvr.co.uk              Development             Designer Servers Ltd

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

Magnus Hagander | 1 Mar 2004 21:54

Re: Scaling further up

> All:
> 
> We have a Quad-Intel XEON 2.0GHz (1MB cache), 12GB memory, running
RH9, PG 7.4.0. There's 
> an internal U320, 10K RPM RAID-10 setup on 4 drives.
> 
> We are expecting a pretty high load, a few thousands of 'concurrent'
users executing either 
> select, insert, update, statments.

> What is the next step up in terms of  handling very heavy loads?
Clustering? 

I'd look at adding more disks first. Depending on what type of query
load you get, that box sounds like it will be very much I/O bound. More
spindles = more parallell operations = faster under load. Consider
adding 15KRPM disks as well, they're not all that much more expensive,
and should give you better performance than 10KRPM.

Also, make sure you put your WAL disks on a separate RAIDset if possible
(not just a separate partition on existing RAIDset).

Finally, if you don't already have it, look for a battery-backed RAID
controller that can do writeback-cacheing, and enable that. (Don't even
think about enabling it unless it's battery backed!) And add as much RAM
as you can to that controller.

//Magnus

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faqs/FAQ.html

Simon Riggs | 2 Mar 2004 01:27
Favicon
Gravatar

Re: WAL Optimisation - configuration and usage

>Rob Fielding wrote:
> My focus today has been on WAL - I've not looked at WAL before. By
> increasing the settings thus :
> 
> wal_buffers = 64 # need to determin WAL usage
> wal_files = 64 # range 0-64
> wal_sync_method = fsync   # the default varies across platforms:
> wal_debug = 0             # range 0-16
> 
> # hopefully this should see less LogFlushes per LogInsert - use more
WAL
> though.
> commit_delay = 10000     # range 0-100000
> commit_siblings = 2       # range 1-1000
> checkpoint_segments = 16   # in logfile segments (16MB each), min 1
> checkpoint_timeout = 300  # in seconds, range 30-3600
> fsync = true

> But I think we can get more out of this as the above setting were
picked
> from thin air  and my concern here is being able to determin WAL file
> usage and if the system is caught out on the other extreme that we're
> not commiting fast enough. Currently I've read that WAL files
shouldn't
> be more than 2*checkpoint_segments+1 however my pg_xlog directory
> contains 74 files. This suggests I'm using more logfiles than I
should.
> Also I'm not sure what wal_buffers really should be set to.

As Richard Huxton says, we're all learning...I'm looking at WAL logic
for other reasons right now...

This is based upon my reading of the code; I think the manual contains
at least one confusion that has not assisted your understanding (or
mine):

The WAL files limit of 2*checkpoint_segments+1 refers to the number of
files allocated-in-advance of the current log, not the total number of
files in use. pg uses a cycle of logs, reusing older ones when all the
transactions in those log files have been checkpointed. The limit is set
to allow checkpoint to release segments and have them all be reused at
once. Pg stores them up for use again later when workload hots up again.

If it cannot recycle a file because there is a still-current txn on the
end of the cycle, then it will allocate a new file and use this instead,
but still keeping everything in a cycle. Thus if transactions are
particularly long running, then the number of files in the cycle will
grow. So overall, normal behaviour so far. I don't think there's
anything to worry about in having that many files in your xlog cycle.

That behaviour is usually seen with occasional long running txns. When a
long running transaction is over, pg will try to reduce the number of
files in the cycle until its back to target. 

You seem to be reusing one file in the cycle every 10 mins - this is
happening as the result of a checkpoint timeout - "kinda automated" as
you say. [A checkpoint is the only time you can get the messages you're
getting] At one file per checkpoint, it will take 16*2+1=33
checkpoints*10 mins = 5 hours before it hits the advance allocation file
limit and then starts to reduce number of files. That's why they appear
to stay constant...

If you want to check whether this is correct, manually issue a number of
CHECKPOINT statements. The messages should change from "recycled" to
"removing" transaction log file once you've got to 33 checkpoints - the
number of WAL log files should start to go down also? If so, then
there's nothing too strange going on, just pg being a little slow in
reducing the number of wal log files.

So, it seems that you are running occasional very long transactions.
During that period you run up to 60-80 wal files. That's just on the
edge of your wal_buffers limit, which means you start to write wal
quicker than you'd like past that point. Your checkpoint_timeout is 300
seconds, but a checkpoint will also be called every checkpoint_segments,
or currently every 16 wal files. Since you go as high as 60-80 then you
are checkpointing 4-5 times during the heavy transaction period -
assuming it's all one block of work. In the end, each checkpoint is
causing a huge I/O storm, during which not much work happens. 

I would suggest that you reduce the effect of checkpointing by either:
- re-write app to do scan deletes in smaller chunks in quieter periods
or
- increase checkpoint_segments to 128, though this may effect your
recoverability

You can of course only do so much with the memory available to you. If
you increase one allocation of memory, you may have to reduce another
parameter and that may be counter productive.

[An alternative view is that you should go for more frequent, not less
frequent checkpoints in this situation, smoothing out the effect of the
checkpoints, rather than trying to avoid them at all. On the other hand,
that approach also increases total WAL log volume, which means you'll
make poor use of I/O and memory buffering. I'd stay high.]

However, I'm not sure 
- why checkpoint interval of 300 secs causes them to happen every 10
mins in quieter periods; is that an occaisional update occurring?
- why checkpoint only releases single Wal file each time - but that
maybe me just reading the code incorrectly.

Please set WAL_DEBUG to 1 so we can see a bit more info: thanks.

> Can I get any feedback on this ? How to look into pg's WAL usage would
> be what I'm looking for. BTW this is an old install I'm afraid 7.2.2 -
> it's been impossible to upgrade up until now because it's been too
slow.
> I have moved the pg_xlog onto the root SCSI disk - it doesn't appear
to
> have made a huge difference but it could be on the same cable.

My advice is don't touch WAL_SYNC_METHOD...

I **think** the WAL behaviour is still the same in 7.4.1, so no rush to
upgrade on that account - unless you're using temporary tables....

Best Regards, Simon Riggs

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo <at> postgresql.org)

Tom Lane | 2 Mar 2004 02:01
Picon

Re: WAL Optimisation - configuration and usage

"Simon Riggs" <simon <at> 2ndquadrant.com> writes:
> - why checkpoint interval of 300 secs causes them to happen every 10
> mins in quieter periods; is that an occaisional update occurring?

There is code in there to suppress a checkpoint if no WAL-loggable
activity has happened since the last checkpoint.  Not sure if that's
relevant to the issue or not though.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to majordomo <at> postgresql.org so that your
      message can get through to the mailing list cleanly

postgres | 2 Mar 2004 01:49
Picon

Re: Select-Insert-Query

Hi,

nobody has an idea? :-(

-----Ursprüngliche Nachricht-----
Von: pgsql-performance-owner <at> postgresql.org [mailto:pgsql-performance-owner <at> postgresql.org] Im Auftrag von postgres <at> countup.de
Gesendet: Freitag, 27. Februar 2004 17:53
An: pgsql-performance <at> postgresql.org
Betreff: [PERFORM] Select-Insert-Query

Hi,

what is the most performant way to select for example the first 99 rows of a table and insert them into another table...

at the moment i do this:

for userrecord in select * from table where account_id = a_account_id and counter_id = userrecord.counter_id and visitortable_id between a_minid and a_maxid limit 99 loop
insert into lastusers (account_id, counter_id, date, ip, hostname) values(a_account_id,userrecord.counter_id,userrecord.date ,userrecord.ip,userrecord.hostname);
end loop;

i think "limit" is a performance killer, is that right? but what to do instead

thanks
bye

Gmane