Amit Kapila | 21 May 2013 12:23
Favicon

Re: Move unused buffers to freelist

On Tuesday, May 21, 2013 12:36 PM Amit Kapila wrote:
> On Monday, May 20, 2013 6:54 PM Robert Haas wrote:
> > On Thu, May 16, 2013 at 10:18 AM, Amit Kapila
> <amit.kapila <at> huawei.com>
> > wrote:
> > > Further Performance Data:
> > >
> > > Below data is for average 3 runs of 20 minutes
> > >
> > > Scale Factor   - 1200
> > > Shared Buffers - 7G
> >
> > These results are good but I don't get similar results in my own
> > testing.
> 
> Thanks for running detailed tests
> 
> > I ran pgbench tests at a variety of client counts and scale
> > factors, using 30-minute test runs and the following non-default
> > configuration parameters.
> >
> > shared_buffers = 8GB
> > maintenance_work_mem = 1GB
> > synchronous_commit = off
> > checkpoint_segments = 300
> > checkpoint_timeout = 15min
> > checkpoint_completion_target = 0.9
> > log_line_prefix = '%t [%p] '
> >
> > Here are the results.  The first field in each line is the number of
(Continue reading)

Michael Paquier | 21 May 2013 01:50
Picon

Removal of pageinspect--1.0.sql

Hi all,

The contrib module pageinspect has been upgraded to 1.1, but pageinspect--1.0.sql is still present in source code. Shouldn't it be removed? Please find patch attached.

Thanks
--
Michael
Attachment (20130521_pageinspect10_removal.patch): application/octet-stream, 2670 bytes

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Daniel Wood | 20 May 2013 21:30
Favicon

FK locking concurrency improvement

As part of 0ac5ad5134f2769ccbaefec73844f8504c4d6182
the permutations in test/isolation/fk-deadlock2.spec and elsewhere were 
removed.  Is it the intent that these tests no longer do anything 
useful?  I was expecting a failure in the test with some work I'm doing 
and was confused, after a merge from the upstream 9.3, that the test 
didn't fail until I noticed the test is no longer running the permutations.

FYI, I saw some comments and adding fflush's into isolationtester.c.  I 
ran into the same problem with debugging tests when they failed/hung in 
the middle.  A simple "setbuf(stdout, NULL)" at the beginning of main 
gets rid of the problem where line buffering becomes block buffering 
when redirecting stdout to a file.  This causes problems with sequencing 
of mixed stderr and stdout and not seeing the last few lines of stdout 
if the process fails or hangs.  The setbuf on stdout shown above 
disables buffering of stdout to match the unbuffered stderr.

That way you don't need to fflush after each printf/fprintf.  I'm not 
sure why fflush of stderr was added because it isn't buffered to begin 
with so is unnecessary.  The problem was with stdout.  YMMV on windows 
but might work.

- Dan

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Alfred Perlstein | 20 May 2013 17:31
Picon
Favicon
Gravatar

PGCON meetup FreeNAS/FreeBSD: In Ottawa Tue & Wed.

Hello PostgreSQL Hackers,

I am now in Ottawa, last week we wrapped up the BSDCon and I was hoping 
to chat with a few Postgresql developers in person about using 
Postgresql in FreeNAS and offering it as an extension to the platform as 
a plug-in technology.  Unfortunately due to time constraints I can not 
attend the entire conference and I am only in town until Wednesday at noon.

I'm hoping there's a good time to talk to a few developers about 
Postgresql + FreeNAS before I have to depart back to the bay area.

Some info on me:  My name is Alfred Perlstein, I am a FreeBSD developer 
and FreeNAS project lead.  I am the VP of Software Engineering at 
iXsystems.  I have been a fan of Postgresql for many years.  In the 
early 2000s we build a high speed web tracking application on top of 
Postgresql and worked closely with the community to shake out 
performance and bug, so closely that Tom Lane and Vadim Mikheev////had 
logins on our box.  Since that time I have tried to get Postgresql into 
as many places as possible.

Some info on the topics I wanted to briefly discuss:

1) Using Postgresql as the config store for FreeNAS.
We currently use SQLITE, SQLITE fits our needs until we get to the point 
of replication between HA (high availability) units.  Then we are forced 
to manually sync data between configurations.  A discussion on how we 
might do this better using Postgresql, while still maintaining our ease 
of config export (single file) and small footprint would be interesting.

2) Postgresql plugin for FreeNAS.
Flip a switch and suddenly your file server is also serving enterprise 
data.  We currently have a plug-in architecture, but would like to 
discuss the possibility of a tighter integration so that Postgresql 
looks like a more cohesive addition to FreeNAS.

3) Statistic monitoring / EagleEye
In FreeBSD/FreeNAS I have developed a system called EagleEye. EagleEye 
is a system where all mibs are easily exportable with timestamps in a 
common format (for now YAML & modified CSV) which is then consumed by a 
utility which can then provide graphs. The entire point of EagleEye is 
to eventually upstream the modifications to future proof statistics 
tracking into the FreeBSD and FreeNAS systems.  I have spoken with some 
Illuminos/ZFS developers and they are interested as well.

I think that is all I have, please drop me a note if you'll have some 
time in Ottawa today, tomorrow or early Wednesday.  I'd love to discuss 
and buy some beers for the group.

thank you,
-Alfred Perlstein
VP Software Engineering, iXsystems.

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas | 20 May 2013 15:24
Picon

Re: Move unused buffers to freelist

On Thu, May 16, 2013 at 10:18 AM, Amit Kapila <amit.kapila <at> huawei.com> wrote:
> Further Performance Data:
>
> Below data is for average 3 runs of 20 minutes
>
> Scale Factor   - 1200
> Shared Buffers - 7G

These results are good but I don't get similar results in my own
testing.  I ran pgbench tests at a variety of client counts and scale
factors, using 30-minute test runs and the following non-default
configuration parameters.

shared_buffers = 8GB
maintenance_work_mem = 1GB
synchronous_commit = off
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
log_line_prefix = '%t [%p] '

Here are the results.  The first field in each line is the number of
clients.  The second number is the scale factor.  The numbers after
"master" and "patched" are the median of three runs.

01 100 master 1433.297699 patched 1420.306088
01 300 master 1371.286876 patched 1368.910732
01 1000 master 1056.891901 patched 1067.341658
01 3000 master 637.312651 patched 685.205011
08 100 master 10575.017704 patched 11456.043638
08 300 master 9262.601107 patched 9120.925071
08 1000 master 1721.807658 patched 1800.733257
08 3000 master 819.694049 patched 854.333830
32 100 master 26981.677368 patched 27024.507600
32 300 master 14554.870871 patched 14778.285400
32 1000 master 1941.733251 patched 1990.248137
32 3000 master 846.654654 patched 892.554222

And here's the same results for 5-minute, read-only tests:

01 100 master 9361.073952 patched 9049.553997
01 300 master 8640.235680 patched 8646.590739
01 1000 master 8339.364026 patched 8342.799468
01 3000 master 7968.428287 patched 7882.121547
08 100 master 71311.491773 patched 71812.899492
08 300 master 69238.839225 patched 70063.632081
08 1000 master 34794.778567 patched 65998.468775
08 3000 master 60834.509571 patched 61165.998080
32 100 master 203168.264456 patched 205258.283852
32 300 master 199137.276025 patched 200391.633074
32 1000 master 177996.853496 patched 176365.732087
32 3000 master 149891.147442 patched 148683.269107

Something appears to have screwed up my results for 8 clients  <at>  scale
factor 300 on master, but overall, on both the read-only and
read-write tests, I'm not seeing anything that resembles the big gains
you reported.

Tests were run on a 16-core, 64-hwthread PPC64 machine provided to the
PostgreSQL community courtesy of IBM.  Fedora 16, Linux kernel 3.2.6.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Soroosh Sardari | 20 May 2013 09:28
Picon
Gravatar

Why there is a union in HeapTupleHeaderData struct

Dear Hackers

In fix part oh HeapTuple, there is a union that is named t_choice,
union
    {
        HeapTupleFields t_heap;
        DatumTupleFields t_datum;
    }            t_choice;

I can't find out why we need t_datum, actually there is no comment about DatumTupleFields.

Regards
Soroosh


Chris Farmiloe | 20 May 2013 03:54
Picon
Gravatar

ASYNC Privileges proposal

Hey all,

I find the current LISTEN / NOTIFY rather limited in the context of databases with multiple roles. As it stands it is not possible to restrict the use of LISTEN or NOTIFY to specific roles, and therefore notifications (and their payloads) cannot really be trusted as coming from any particular source.

If the payloads of notifications could be trusted, then applications could make better use of them, without fear of leaking any sensitive information to anyone who shouldn't be able to see it. 

I'd like to propose a new ASYNC database privilege that would control whether a role can use LISTEN, NOTIFY and UNLISTEN statements and the associated pg_notify function.

ie: 
GRANT ASYNC ON DATABASE xxxx TO bob;
REVOKE ASYNC ON DATABASE xxxx FROM bob;

SECURITY DEFINER functions could then be used anywhere that a finer grained access control was required.

I had a quick play to see what might be involved [attached], and would like to hear people thoughts; good idea, bad idea, not like that! etc  

Chris.

Attachment (async_privileges_r0.patch): application/octet-stream, 7410 bytes

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jeff Janes | 19 May 2013 22:48
Picon

psql \watch versus \timing


I'd like to run same query repeatedly and see how long it takes each time.

I thought \watch would be excellent for this, but it turns out that using \watch suppresses the output of \timing.

Is this intentional, or unavoidable?  

Also, is it just or does the inability to watch more frequently than once a second make it a lot less useful than it could be?


Cheers,

Jeff
Greg Smith | 19 May 2013 10:15
Favicon

Block write statistics WIP

I have some time now for working on the mystery of why there are no 
block write statistics in the database.  I hacked out the statistics 
collection and reporting side already, rough patch attached if you want 
to see the boring parts.

I guessed that there had to be a gotcha behind why this wasn't done 
before now, and I've found one so far.  All of the read statistics are 
collected with a macro that expects to know a Relation number.  Callers 
to ReadBuffer pass one.  On the write side, the two places that 
increment the existing write counters (pg_stat_statements, vacuum) are 
MarkBufferDirty and MarkBufferDirtyHint.  Neither of those is given a 
Relation though.  Inspecting the Buffer passed can only find the buffer 
tag's RelFileNode.

I've thought of two paths to get a block write count out of that so far:

-Provide a function to find the Relation from the RelFileNode.  There is 
a warning about the perils of assuming you can map that way from a 
buftag value in buf_internals.h though:

"Note: the BufferTag data must be sufficient to determine where to write 
the block, without reference to pg_class or pg_tablespace entries.  It's 
possible that the backend flushing the buffer doesn't even believe the 
relation is visible yet (its xact may have started before the xact that 
created the rel).  The storage manager must be able to cope anyway."

-Modify every caller of MarkDirty* to include a relation when that 
information is available.  There are about 200 callers of those 
functions around, so that won't be fun.  I noted that many of them are 
in the XLog replay functions, which won't have the relation--but those 
aren't important to count anyway.

Neither of these options feels very good to me, so selecting between the 
two feels like picking the lesser of two evils.  I'm fine with chugging 
through all of the callers to modify MarkDirty*, but I didn't want to do 
that only to have the whole approach rejected as wrong.  That's why I 
stopped here to get some feedback.

The way that MarkDirty requires this specific underlying storage manager 
behavior to work properly strikes me as as a bit of a layering violation 
too.  I'd like the read and write paths to have a similar API, but here 
they don't even operate on the same type of inputs.  Addressing that is 
probably harder than just throwing a hack on the existing code though.

-- 
Greg Smith   2ndQuadrant US    greg <at> 2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a03bfa6..1773d59 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
 <at>  <at>  -467,15 +467,19  <at>  <at>  CREATE VIEW pg_statio_all_tables AS
             pg_stat_get_blocks_fetched(C.oid) -
                     pg_stat_get_blocks_hit(C.oid) AS heap_blks_read,
             pg_stat_get_blocks_hit(C.oid) AS heap_blks_hit,
+            pg_stat_get_blocks_dirtied(C.oid) AS heap_blks_dirtied,
             sum(pg_stat_get_blocks_fetched(I.indexrelid) -
                     pg_stat_get_blocks_hit(I.indexrelid))::bigint AS idx_blks_read,
             sum(pg_stat_get_blocks_hit(I.indexrelid))::bigint AS idx_blks_hit,
+            sum(pg_stat_get_blocks_dirtied(I.indexrelid))::bigint AS idx_blks_dirtied,
             pg_stat_get_blocks_fetched(T.oid) -
                     pg_stat_get_blocks_hit(T.oid) AS toast_blks_read,
             pg_stat_get_blocks_hit(T.oid) AS toast_blks_hit,
+            pg_stat_get_blocks_dirtied(T.oid) AS toast_blks_dirtied,
             pg_stat_get_blocks_fetched(X.oid) -
                     pg_stat_get_blocks_hit(X.oid) AS tidx_blks_read,
             pg_stat_get_blocks_hit(X.oid) AS tidx_blks_hit
+            pg_stat_get_blocks_dirted(X.oid) AS tidx_blks_dirtied
     FROM pg_class C LEFT JOIN
             pg_index I ON C.oid = I.indrelid LEFT JOIN
             pg_class T ON C.reltoastrelid = T.oid LEFT JOIN
 <at>  <at>  -530,6 +534,7  <at>  <at>  CREATE VIEW pg_statio_all_indexes AS
             pg_stat_get_blocks_fetched(I.oid) -
                     pg_stat_get_blocks_hit(I.oid) AS idx_blks_read,
             pg_stat_get_blocks_hit(I.oid) AS idx_blks_hit
+            pg_stat_get_blocks_dirtied(I.oid) AS idx_blks_dirtied
     FROM pg_class C JOIN
             pg_index X ON C.oid = X.indrelid JOIN
             pg_class I ON I.oid = X.indexrelid
 <at>  <at>  -554,6 +559,7  <at>  <at>  CREATE VIEW pg_statio_all_sequences AS
             pg_stat_get_blocks_fetched(C.oid) -
                     pg_stat_get_blocks_hit(C.oid) AS blks_read,
             pg_stat_get_blocks_hit(C.oid) AS blks_hit
+            pg_stat_get_blocks_dirtied(C.oid) AS blks_dirtied
     FROM pg_class C
             LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
     WHERE C.relkind = 'S';
 <at>  <at>  -622,6 +628,7  <at>  <at>  CREATE VIEW pg_stat_database AS
             pg_stat_get_db_blocks_fetched(D.oid) -
                     pg_stat_get_db_blocks_hit(D.oid) AS blks_read,
             pg_stat_get_db_blocks_hit(D.oid) AS blks_hit,
+            pg_stat_get_db_blocks_dirtied(D.oid) AS blks_dirtied,
             pg_stat_get_db_tuples_returned(D.oid) AS tup_returned,
             pg_stat_get_db_tuples_fetched(D.oid) AS tup_fetched,
             pg_stat_get_db_tuples_inserted(D.oid) AS tup_inserted,
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 29d986a..2a720f4 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
 <at>  <at>  -3350,6 +3350,7  <at>  <at>  reset_dbentry_counters(PgStat_StatDBEntry *dbentry)
 	dbentry->n_xact_rollback = 0;
 	dbentry->n_blocks_fetched = 0;
 	dbentry->n_blocks_hit = 0;
+	dbentry->n_blocks_dirtied = 0;
 	dbentry->n_tuples_returned = 0;
 	dbentry->n_tuples_fetched = 0;
 	dbentry->n_tuples_inserted = 0;
 <at>  <at>  -3454,6 +3455,7  <at>  <at>  pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
 		result->changes_since_analyze = 0;
 		result->blocks_fetched = 0;
 		result->blocks_hit = 0;
+		result->blocks_dirtied = 0;
 		result->vacuum_timestamp = 0;
 		result->vacuum_count = 0;
 		result->autovac_vacuum_timestamp = 0;
 <at>  <at>  -4517,6 +4519,7  <at>  <at>  pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 			tabentry->changes_since_analyze = tabmsg->t_counts.t_changed_tuples;
 			tabentry->blocks_fetched = tabmsg->t_counts.t_blocks_fetched;
 			tabentry->blocks_hit = tabmsg->t_counts.t_blocks_hit;
+			tabentry->blocks_dirtied = tabmsg->t_counts.t_blocks_dirtied;

 			tabentry->vacuum_timestamp = 0;
 			tabentry->vacuum_count = 0;
 <at>  <at>  -4544,6 +4547,7  <at>  <at>  pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 			tabentry->changes_since_analyze += tabmsg->t_counts.t_changed_tuples;
 			tabentry->blocks_fetched += tabmsg->t_counts.t_blocks_fetched;
 			tabentry->blocks_hit += tabmsg->t_counts.t_blocks_hit;
+			tabentry->blocks_dirtied += tabmsg->t_counts.t_blocks_dirtied;
 		}

 		/* Clamp n_live_tuples in case of negative delta_live_tuples */
 <at>  <at>  -4561,6 +4565,7  <at>  <at>  pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 		dbentry->n_tuples_deleted += tabmsg->t_counts.t_tuples_deleted;
 		dbentry->n_blocks_fetched += tabmsg->t_counts.t_blocks_fetched;
 		dbentry->n_blocks_hit += tabmsg->t_counts.t_blocks_hit;
+		dbentry->n_blocks_dirtied += tabmsg->t_counts.t_blocks_dirtied;
 	}
 }

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 1c41428..9009d9e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
 <at>  <at>  -1009,6 +1009,15  <at>  <at>  MarkBufferDirty(Buffer buffer)
 	 */
 	if (!(bufHdr->flags & BM_DIRTY))
 	{
+		/*
+		 * TODO This is not a legitimate lookup of the relation that
+		 * pgstat_count_buffer_dirtied is normally looking for.
+		 * This is the relation relfilnode.  Comments in buf_internals.h
+		 * point out that the buffer manager might be writing data for
+		 * relations that aren't visible yet.
+		 */
+		pgstat_count_buffer_dirtied(bufHdr->tag->rnode);
+
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
 		if (VacuumCostActive)
 <at>  <at>  -2700,6 +2709,14  <at>  <at>  MarkBufferDirtyHint(Buffer buffer)

 		if (dirtied)
 		{
+			/*
+			 * TODO This is not a legitimate lookup of the relation that
+			 * pgstat_count_buffer_dirtied is normally looking for.
+			 * This is the relation relfilnode.  Comments in buf_internals.h
+			 * point out that the buffer manager might be writing data for
+			 * relations that aren't visible yet.
+			 */
+			pgstat_count_buffer_dirtied(bufHdr->tag->rnode);
 			VacuumPageDirty++;
 			if (VacuumCostActive)
 				VacuumCostBalance += VacuumCostPageDirty;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 8c1a767..c9e28a3 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
 <at>  <at>  -36,6 +36,7  <at>  <at>  extern Datum pg_stat_get_live_tuples(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_dead_tuples(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_blocks_hit(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_blocks_dirtied(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_last_vacuum_time(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_last_autovacuum_time(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_last_analyze_time(PG_FUNCTION_ARGS);
 <at>  <at>  -68,6 +69,7  <at>  <at>  extern Datum pg_stat_get_db_xact_commit(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_xact_rollback(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_blocks_fetched(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_blocks_hit(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_db_blocks_dirtied(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_tuples_returned(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_tuples_fetched(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_tuples_inserted(PG_FUNCTION_ARGS);
 <at>  <at>  -107,6 +109,7  <at>  <at>  extern Datum pg_stat_get_xact_tuples_deleted(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_xact_tuples_hot_updated(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_xact_blocks_fetched(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_xact_blocks_hit(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_xact_blocks_dirtied(PG_FUNCTION_ARGS);

 extern Datum pg_stat_get_xact_function_calls(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_xact_function_total_time(PG_FUNCTION_ARGS);
 <at>  <at>  -297,6 +300,21  <at>  <at>  pg_stat_get_blocks_hit(PG_FUNCTION_ARGS)
 }

 Datum
+pg_stat_get_blocks_dirtied(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int64		result;
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		result = 0;
+	else
+		result = (int64) (tabentry->blocks_dirtied);
+
+	PG_RETURN_INT64(result);
+}
+
+Datum
 pg_stat_get_last_vacuum_time(PG_FUNCTION_ARGS)
 {
 	Oid			relid = PG_GETARG_OID(0);
 <at>  <at>  -1120,6 +1138,22  <at>  <at>  pg_stat_get_db_blocks_hit(PG_FUNCTION_ARGS)

 
 Datum
+pg_stat_get_db_blocks_dirtied(PG_FUNCTION_ARGS)
+{
+	Oid			dbid = PG_GETARG_OID(0);
+	int64		result;
+	PgStat_StatDBEntry *dbentry;
+
+	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
+		result = 0;
+	else
+		result = (int64) (dbentry->n_blocks_dirtied);
+
+	PG_RETURN_INT64(result);
+}
+
+
+Datum
 pg_stat_get_db_tuples_returned(PG_FUNCTION_ARGS)
 {
 	Oid			dbid = PG_GETARG_OID(0);
 <at>  <at>  -1611,6 +1645,21  <at>  <at>  pg_stat_get_xact_blocks_hit(PG_FUNCTION_ARGS)
 }

 Datum
+pg_stat_get_xact_blocks_dirtied(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	int64		result;
+	PgStat_TableStatus *tabentry;
+
+	if ((tabentry = find_tabstat_entry(relid)) == NULL)
+		result = 0;
+	else
+		result = (int64) (tabentry->t_counts.t_blocks_dirtied);
+
+	PG_RETURN_INT64(result);
+}
+
+Datum
 pg_stat_get_xact_function_calls(PG_FUNCTION_ARGS)
 {
 	Oid			funcid = PG_GETARG_OID(0);
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index feecbf9..8a7f073 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
 <at>  <at>  -2597,6 +2597,8  <at>  <at>  DATA(insert OID = 1934 (  pg_stat_get_blocks_fetched	PGNSP PGUID 12 1 0 0 0 f f
 DESCR("statistics: number of blocks fetched");
 DATA(insert OID = 1935 (  pg_stat_get_blocks_hit		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_
_null_ _null_ pg_stat_get_blocks_hit _null_ _null_ _null_ ));
 DESCR("statistics: number of blocks found in cache");
+DATA(insert OID = 3177 (  pg_stat_get_blocks_dirtied		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_
_null_ _null_ _null_ pg_stat_get_blocks_hit _null_ _null_ _null_ ));
+DESCR("statistics: number of blocks dirtied");
 DATA(insert OID = 2781 (  pg_stat_get_last_vacuum_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 1184 "26" _null_
_null_ _null_ _null_	pg_stat_get_last_vacuum_time _null_ _null_ _null_ ));
 DESCR("statistics: last manual vacuum time for a table");
 DATA(insert OID = 2782 (  pg_stat_get_last_autovacuum_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 1184 "26"
_null_ _null_ _null_ _null_	pg_stat_get_last_autovacuum_time _null_ _null_ _null_ ));
 <at>  <at>  -2651,6 +2653,8  <at>  <at>  DATA(insert OID = 1944 (  pg_stat_get_db_blocks_fetched PGNSP PGUID 12 1 0 0 0 f
 DESCR("statistics: blocks fetched for database");
 DATA(insert OID = 1945 (  pg_stat_get_db_blocks_hit		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_ _null_
_null_ _null_ pg_stat_get_db_blocks_hit _null_ _null_ _null_ ));
 DESCR("statistics: blocks found in cache for database");
+DATA(insert OID = 3178 (  pg_stat_get_db_blocks_dirtied		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_
_null_ _null_ _null_ pg_stat_get_db_blocks_hit _null_ _null_ _null_ ));
+DESCR("statistics: blocks dirtied in cache for database");
 DATA(insert OID = 2758 (  pg_stat_get_db_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_
_null_ _null_ _null_ pg_stat_get_db_tuples_returned _null_ _null_ _null_ ));
 DESCR("statistics: tuples returned for database");
 DATA(insert OID = 2759 (  pg_stat_get_db_tuples_fetched PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 20 "26" _null_
_null_ _null_ _null_ pg_stat_get_db_tuples_fetched _null_ _null_ _null_ ));
 <at>  <at>  -2733,6 +2737,8  <at>  <at>  DATA(insert OID = 3044 (  pg_stat_get_xact_blocks_fetched		PGNSP PGUID 12 1 0 0
 DESCR("statistics: number of blocks fetched in current transaction");
 DATA(insert OID = 3045 (  pg_stat_get_xact_blocks_hit			PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 20 "26" _null_
_null_ _null_ _null_ pg_stat_get_xact_blocks_hit _null_ _null_ _null_ ));
 DESCR("statistics: number of blocks found in cache in current transaction");
+DATA(insert OID = 3179 (  pg_stat_get_xact_blocks_dirtied		PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 20 "26"
_null_ _null_ _null_ _null_ pg_stat_get_xact_blocks_hit _null_ _null_ _null_ ));
+DESCR("statistics: number of blocks dirtied in current transaction");
 DATA(insert OID = 3046 (  pg_stat_get_xact_function_calls		PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 20 "26" _null_
_null_ _null_ _null_ pg_stat_get_xact_function_calls _null_ _null_ _null_ ));
 DESCR("statistics: number of function calls in current transaction");
 DATA(insert OID = 3047 (  pg_stat_get_xact_function_total_time	PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 701 "26"
_null_ _null_ _null_ _null_ pg_stat_get_xact_function_total_time _null_ _null_ _null_ ));
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fb242e4..d79577e 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
 <at>  <at>  -97,6 +97,7  <at>  <at>  typedef struct PgStat_TableCounts

 	PgStat_Counter t_blocks_fetched;
 	PgStat_Counter t_blocks_hit;
+	PgStat_Counter t_blocks_dirtied;
 } PgStat_TableCounts;

 /* Possible targets for resetting cluster-wide shared values */
 <at>  <at>  -528,6 +529,7  <at>  <at>  typedef struct PgStat_StatDBEntry
 	PgStat_Counter n_xact_rollback;
 	PgStat_Counter n_blocks_fetched;
 	PgStat_Counter n_blocks_hit;
+	PgStat_Counter n_blocks_dirtied;
 	PgStat_Counter n_tuples_returned;
 	PgStat_Counter n_tuples_fetched;
 	PgStat_Counter n_tuples_inserted;
 <at>  <at>  -581,6 +583,7  <at>  <at>  typedef struct PgStat_StatTabEntry

 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
+	PgStat_Counter blocks_dirtied;

 	TimestampTz vacuum_timestamp;		/* user initiated vacuum */
 	PgStat_Counter vacuum_count;
 <at>  <at>  -834,6 +837,11  <at>  <at>  extern void pgstat_initstats(Relation rel);
 		if ((rel)->pgstat_info != NULL)								\
 			(rel)->pgstat_info->t_counts.t_blocks_hit++;			\
 	} while (0)
+#define pgstat_count_buffer_dirty(rel)								\
+	do {															\
+		if ((rel)->pgstat_info != NULL)								\
+			(rel)->pgstat_info->t_counts.t_blocks_dirty++;			\
+	} while (0)
 #define pgstat_count_buffer_read_time(n)							\
 	(pgStatBlockReadTime += (n))
 #define pgstat_count_buffer_write_time(n)							\

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Josh Berkus | 18 May 2013 23:34

pgbench vs. SERIALIZABLE

Hackers,

I recently had a reason to benchmark a database which is default
SERIALIZABLE mode.  I was startled to discover that pgbench is set up to
abort the client once it hits a serialization failure.  You get a bunch
of these:

Client 7 aborted in state 11: ERROR:  could not serialize access due to
read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.
HINT:  The transaction might succeed if retried.
Client 0 aborted in state 11: ERROR:  could not serialize access due to
read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.

... which continue until you're down to one client, which then finished
out the pgbench (at very low rates, of course).

The problem is this code here:

                if (commands[st->state]->type == SQL_COMMAND)
                {
                        /*

                         * Read and discard the query result; note this
is not included in
                         * the statement latency numbers.

                         */
                        res = PQgetResult(st->con);
                        switch (PQresultStatus(res))
                        {
                                case PGRES_COMMAND_OK:
                                case PGRES_TUPLES_OK:
                                        break;          /* OK */
                                default:
                                        fprintf(stderr, "Client %d
aborted in state %d: %s",
                                                        st->id,
st->state, PQerrorMessage(st->con));
                                        PQclear(res);
                                        return clientDone(st, false);
                        }
                        PQclear(res);
                        discard_response(st);

The way I read that, if the client encounters any errors at all, it
gives up and halts that client.  This doesn't seem very robust, and it
certainly won't work with SERIALIZABLE.

My thinking is that what pgbench should do is:
* track an error count
* if it finds an error, don't increment the transaction count, but do
increment the error count.
* then continue to the next transaction.

Does that seem like the right approach?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Soroosh Sardari | 18 May 2013 16:31
Picon
Gravatar

Road map to study about fetching a set of tuples - novice!

Hi

I was tracing a simple SELECT query to find how pg works for fetching tuples.
but I'm totally lost in the code.
Could you help me to understand under the hood?
I know about parsing and planning parts, my actual problem is executer.
If you show me a road map to study, I would appreciate it.

Regards
Soroosh

Gmane