== PostgreSQL Weekly News - July 06 2014 ==
David Fetter <david <at> fetter.org>
2014-07-07 04:44:17 GMT
PGConf.EU 2014 in Madrid, Spain on October 21-24 is now open for
PGDay.IT 2014 will take place in Prato on November the 7th 2014. The
International Call For Papers is now open:
== PostgreSQL Product News ==
Database Designer for PostgreSQL 1.10.0, a CASE tool which works
natively under the Windows OS family, released.
== PostgreSQL Jobs for July ==
== PostgreSQL Local ==
Char(14) and PGday UK will be held July 8 and 9, 2014.
PgDay Portland, Oregon 2014 will be held Saturday September 6, 2014.
Postgres Open 2014 will be in Chicago, IL, USA, September 17-19.
Tickets and Tutorials now available for purchase.
The sixth PGDay Cubano will be held on 13 and 14 October 2014 in Habana.
PostgreSQL Conference Europe 2014 will be held on October 21-24 in
Madrid, Spain, at the Hotel Miguel Angel.
== PostgreSQL in the News ==
Planet PostgreSQL: http://planet.postgresql.org/
PostgreSQL Weekly News is brought to you this week by David Fetter
Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david <at> fetter.org, German language
to pwn <at> pgug.de, Italian language to pwn <at> itpug.org, Spanish language
to pwn <at> arpug.com.ar.
== Applied Patches ==
Heikki Linnakangas pushed:
- Fix and enhance the assertion of no palloc's in a critical section.
The assertion failed if WAL_DEBUG or LWLOCK_STATS was enabled; fix
that by using separate memory contexts for the allocations made
within those code blocks. This patch introduces a mechanism for
marking any memory context as allowed in a critical section.
Previously ErrorContext was exempt as a special case. Instead of a
blanket exception of the checkpointer process, only exempt the
memory context used for the pending ops hash table.
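The per-context exemption described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the PostgreSQL C code: the `MemoryContext` and `Backend` classes, the `allow_in_critical_section` attribute and the `palloc` method are hypothetical stand-ins chosen for illustration. It shows an allocation check that fails inside a critical section unless the target context has been marked as allowed.

```python
class MemoryContext:
    """Toy memory context; the flag marks it as usable in a critical section."""

    def __init__(self, name, allow_in_critical_section=False):
        self.name = name
        self.allow_in_critical_section = allow_in_critical_section
        self.allocations = []


class Backend:
    """Toy backend that tracks how deeply it is nested in critical sections."""

    def __init__(self):
        self.critical_section_depth = 0

    def palloc(self, context, size):
        # The check: allocating is fine outside a critical section, and inside
        # one only when the context was explicitly marked as allowed.
        if self.critical_section_depth > 0 and not context.allow_in_critical_section:
            raise AssertionError(
                f"allocation in context {context.name!r} inside a critical section"
            )
        context.allocations.append(size)
        return size
```

In this model, exempting one context (for example the one backing a pending-operations table) is narrower than exempting a whole process, because every other allocation made by that process is still checked.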
Andres Freund pushed:
- Fix typos in the cluster_name commit. Thom Brown and Fujii Masao
- Check interrupts during logical decoding more frequently. When
reading large amounts of preexisting WAL during logical decoding
using the SQL interface we possibly could fail to check interrupts
in due time. Similarly the same could happen on systems with a very
high WAL volume while creating a new logical replication slot,
independent of the used interface. Previously these checks were
only performed in xlogreader's read_page callbacks, while waiting
for new WAL to be produced. That's not sufficient though, if there's
never a need to wait. Walsender's send loop already contains an
interrupt check. Backpatch to 9.4 where the logical decoding
feature was introduced.
- Rename logical decoding's pg_llog directory to pg_logical. The old
name wasn't very descriptive of the actual contents of the directory,
which are historical snapshots in the snapshots/ subdirectory and
mapping data for rewritten tuples in mappings/. There's been a fair
amount of discussion of what would be a good name. I'm settling for
pg_logical because it's likely that further data around logical
decoding and replication will need saving in the future. Also add
the missing entry for the directory into storage.sgml's list of
PGDATA contents. Bumps catversion as the data directories won't be
- Fix decoding of MULTI_INSERTs when rows other than the last are
toasted. When decoding the results of a HEAP2_MULTI_INSERT
(currently only generated by COPY FROM) toast columns for all but
the last tuple weren't replaced by their actual contents before
being handed to the output plugin. The reassembled toast datums
were disregarded after every
REORDER_BUFFER_CHANGE_(INSERT|UPDATE|DELETE) which is correct for
plain inserts, updates, deletes, but not multi inserts - there we
generate several REORDER_BUFFER_CHANGE_INSERTs for a single
xl_heap_multi_insert record. To solve the problem add a
clear_toast_afterwards boolean to ReorderBufferChange's union member
that's used by modifications. All row changes but multi_inserts
always set that to true, but multi_insert sets it only for the last
change generated. Add a regression test covering decoding of
multi_inserts - there was none at all before. Backpatch to 9.4
where logical decoding was introduced. Bug found by Petr Jelinek.
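The MULTI_INSERT fix in the list above can be pictured with a small Python model. This is a sketch under stated assumptions, not the reorder buffer's C code: the `Change` class and the `decode_multi_insert` and `replay` functions are hypothetical, and a plain dict stands in for the reassembled toast data. Only the idea of a `clear_toast_afterwards` flag that is set on the last change of a multi-insert comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Change:
    row: dict                            # column -> value, or ("toast", id) pointer
    clear_toast_afterwards: bool = True  # plain inserts always clear


def decode_multi_insert(rows):
    """One multi-insert record yields one Change per row; only the last clears."""
    last = len(rows) - 1
    return [Change(row, clear_toast_afterwards=(i == last))
            for i, row in enumerate(rows)]


def replay(changes, toast_store):
    """Replace toast pointers with their contents before handing rows onward."""
    resolved_rows = []
    for change in changes:
        resolved = {}
        for column, value in change.row.items():
            if isinstance(value, tuple) and value[0] == "toast":
                # None models "contents already thrown away"
                resolved[column] = toast_store.get(value[1])
            else:
                resolved[column] = value
        resolved_rows.append(resolved)
        if change.clear_toast_afterwards:
            toast_store.clear()
    return resolved_rows
```

Replaying `[Change(r) for r in rows]`, where every change clears the store, loses the toast contents of every row after the first in this model. Replaying `decode_multi_insert(rows)` keeps them available until the last row has been resolved.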
Noah Misch pushed:
- Don't prematurely free the BufferAccessStrategy in pgstat_heap().
This function continued to use it after heap_endscan() freed it. In
passing, don't explicitly create a strategy here. Instead, use the
one created by heap_beginscan_strat(), if any. Back-patch to 9.2,
where use of a BufferAccessStrategy here was introduced.
- Consistently pass an "unsigned char" to ctype.h functions. The
isxdigit() calls relied on undefined behavior. The isascii() call
was well-defined, but our prevailing style is to include the cast.
Back-patch to 9.4, where the isxdigit() calls were introduced.
Bruce Momjian pushed:
- pg_upgrade: update C comments about pg_dumpall. There were some C
comments that hadn't been updated since the switch from using only
pg_dumpall to using pg_dump and pg_dumpall, so update them. Also,
don't bother using --schema-only for pg_dumpall --globals-only.
Backpatch through 9.4
- pg_upgrade: no need to remove "members" files for pre-9.3 upgrades.
Per analysis by Alvaro Herrera. Backpatch through 9.3
- pg_upgrade: preserve database and relation minmxid values. Also
set these values for pre-9.3 old clusters that don't have values to
preserve. Analysis by Alvaro Herrera. Backpatch through 9.3.
Tom Lane pushed:
- Allow empty replacement strings in contrib/unaccent. This is useful
in languages where diacritic signs are represented as separate
characters; it's also one step towards letting unaccent be used for
arbitrary substring substitutions. In passing, improve the user
documentation for unaccent, which was sadly vague about some
important details. Mohammad Alhashash, reviewed by Abhijit
- Allow multi-character source strings in contrib/unaccent. This
could be useful in languages where diacritic signs are represented
as separate characters; more generally it supports using unaccent
dictionaries for substring substitutions beyond narrowly conceived
"diacritic removal". In any case, since the rule-file parser
doesn't complain about multi-character source strings, it behooves
us to do something unsurprising with them.
- Issue a WARNING about invalid rule file format in contrib/unaccent.
We were already issuing a WARNING, albeit only elog not ereport, for
duplicate source strings; so warning rather than just being
stoically silent seems like the best thing to do here. Arguably
both of these complaints should be upgraded to ERRORs, but that
might be more behavioral change than people want. Note: the faulty
line is already printed via an errcontext hook, so there's no need
for more information than these messages provide.
- Fix inadequately-sized output buffer in contrib/unaccent. The
output buffer size in unaccent_lexize() was calculated as input
string length times pg_database_encoding_max_length(), which
effectively assumes that replacement strings aren't more than one
character. While that was all that we previously documented it to
support, the code actually has always allowed replacement strings of
arbitrary length; so if you tried to make use of longer strings, you
were at risk of buffer overrun. To fix, use an expansible
StringInfo buffer instead of trying to determine the maximum space
needed a priori. This would be a security issue if unaccent rules
files could be installed by unprivileged users; but fortunately they
can't, so in the back branches the problem can be labeled as
improper configuration by a superuser. Nonetheless, a memory stomp
isn't a nice way of reacting to improper configuration, so let's
back-patch the fix.
- Improve handling of OOM score adjustment in sample Linux start
script. Per a suggestion from Christoph Berg.
- Remove some useless code in the configure script. Almost ten years
ago, commit e48322a6d6cfce1ec52ab303441df329ddbc04d1 broke the logic
in ACX_PTHREAD by looping through all the possible flags rather than
stopping with the first one that would work. This meant that
$acx_pthread_ok was no longer meaningful after the loop; it would
usually be "no", whether or not we'd found working thread flags.
The reason nobody noticed is that Postgres doesn't actually use any
of the symbols set up by the code after the loop. Rather than
complicate things some more to make it work as designed, let's just
remove all that dead code, and thereby save a few cycles in each
- Refactor CREATE/ALTER DATABASE syntax so options need not be
keywords. Most of the existing option names are keywords anyway,
but we can get rid of LC_COLLATE and LC_CTYPE as keywords known to
the lexer/grammar. This immediately reduces the size of the grammar
tables by about 8KB, and will save more when we add additional
CREATE/ALTER DATABASE options in future. A side effect of the
implementation is that the CONNECTION LIMIT option can now also be
spelled CONNECTION_LIMIT. We choose not to document this, however.
Vik Fearing, based on a suggestion by me; reviewed by Pavel Stehule
- Allow CREATE/ALTER DATABASE to manipulate datistemplate and
datallowconn. Historically these database properties could be
manipulated only by manually updating pg_database, which is
error-prone and only possible for superusers. But there seems no
good reason not to allow database owners to set them for their
databases, so invent CREATE/ALTER DATABASE options to do that.
Adjust a couple of places that were doing it the hard way to use the
commands instead. Vik Fearing, reviewed by Pavel Stehule
- Add some errdetail to checkRuleResultList(). This function wasn't
originally thought to be really user-facing, because converting a
table to a view isn't something we expect people to do manually. So
not all that much effort was spent on the error messages; in
particular, while the code will complain that you got the column
types wrong it won't say exactly what they are. But since we
repurposed the code to also check compatibility of rule RETURNING
lists, it's definitely user-facing. It now seems worthwhile to add
errdetail messages showing exactly what the conflict is when there's
a mismatch of column names or types. This is prompted by bug #10836
from Matthias Raffelsieper, which might have been forestalled if the
error message had reported the wrong column type as being "record".
Back-patch to 9.4, but not into older branches where the set of
translatable error strings is supposed to be stable.
- Improve support for composite types in PL/Python. Allow PL/Python
functions to return arrays of composite types. Also, fix the
restriction that plpy.prepare/plpy.execute couldn't handle query
parameters or result columns of composite types. In passing, adopt
a saner arrangement for where to release the tupledesc reference
counts acquired via lookup_rowtype_tupdesc. The callers of
PLyObject_ToCompositeDatum were doing the lookups, but then the
releases happened somewhere down inside subroutines of
PLyObject_ToCompositeDatum, which is bizarre and bug-prone. Instead
release in the same function that acquires the refcount. Ed Behn
and Ronan Dunklau, reviewed by Abhijit Menon-Sen
- Redesign API presented by nodeAgg.c for ordered-set and similar
aggregates. The previous design exposed the input and output
ExprContexts of the Agg plan node, but work on grouping sets has
suggested that we'll regret doing that. Instead provide more
narrowly-defined APIs that can be implemented in multiple ways,
namely a way to get a short-term memory context and a way to
register an aggregate shutdown callback. Back-patch to 9.4 where
the bad APIs were introduced, since we don't want third-party code
using these APIs and then having to change in 9.5. Andrew Gierth
- Don't cache per-group context across the whole query in
orderedsetaggs.c. Although nodeAgg.c currently uses the same
per-group memory context for all groups of a query, that might
change in future. Avoid assuming it. This costs us an extra
AggCheckCallContext() call per group, but that's pretty cheap and is
probably good from a safety standpoint anyway. Back-patch to 9.4 in
case any third-party code copies this logic. Andrew Gierth
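The contrib/unaccent changes in the list above (empty replacements, multi-character source strings, warnings for bad rule lines, and a growable output buffer) can be sketched together in Python. This is a hedged illustration, not the module's C implementation: `parse_rules` and `unaccent` are hypothetical helpers, the whitespace-separated rule format is simplified, and both the longest-match-first lookup and keeping the first of two duplicate rules are assumptions of this sketch.

```python
import io


def parse_rules(text):
    """Parse 'source [replacement]' lines; a missing replacement means ''."""
    rules, warnings = {}, []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) > 2:
            warnings.append(f"line {lineno}: invalid rule format, ignored")
            continue
        source = fields[0]
        replacement = fields[1] if len(fields) == 2 else ""
        if source in rules:
            warnings.append(f"line {lineno}: duplicate source string, first one kept")
            continue
        rules[source] = replacement
    return rules, warnings


def unaccent(text, rules):
    """Apply the rules left to right, trying the longest source string first."""
    out = io.StringIO()  # growable buffer: no up-front size estimate is needed
    longest = max((len(source) for source in rules), default=0)
    i = 0
    while i < len(text):
        for n in range(min(longest, len(text) - i), 0, -1):
            replacement = rules.get(text[i:i + n])
            if replacement is not None:
                out.write(replacement)  # may be empty, or longer than the source
                i += n
                break
        else:
            out.write(text[i])          # no rule matched at this position
            i += 1
    return out.getvalue()
```

Because the output is accumulated in a buffer that grows on demand, a replacement longer than its source (such as a one-character source mapped to two characters) cannot overrun a precomputed size, which is the property the buffer fix is after.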
Robert Haas pushed:
- Avoid copying index tuples when building an index. The previous
code, perhaps out of concern for avoiding memory leaks, formed the
tuple in one memory context and then copied it to another memory
context. However, this doesn't appear to be necessary, since
index_form_tuple and the functions it calls take precautions against
leaking memory. In my testing, building the tuple directly inside
the sort context shaves several percent off the index build time.
Rearrange things so we do that. Patch by me. Review by Amit
Kapila, Tom Lane, Andres Freund.
- Remove swpb-based spinlock implementation for ARMv5 and earlier.
Per recent analysis by Andres Freund, this implementation is in fact
unsafe, because ARMv5 has weak memory ordering, which means that the
CPU could move loads or stores across the volatile store performed
by the default S_UNLOCK. We could try to fix this, but have no
ARMv5 hardware to test on, so removing support seems better. We can
still support ARMv5 systems on GCC versions new enough to have
built-in atomics support for this platform, and can also re-add
support for the old way if someone has hardware that can be used to
test a fix. However, since the requirement to use a relatively-new
GCC hasn't been an issue for ARMv6 or ARMv7, which lack the swpb
instruction altogether, perhaps it won't be an issue for ARMv5
Fujii Masao pushed:
- Prevent psql from issuing BEGIN before ALTER SYSTEM when AUTOCOMMIT
is off. The autocommit-off mode works by issuing an implicit BEGIN
just before any command that is not already in a transaction block
and is not itself a BEGIN or other transaction-control command, nor
a command that cannot be executed inside a transaction block. This
commit prevents psql from issuing such an implicit BEGIN before
ALTER SYSTEM because it's not allowed inside a transaction block.
Backpatch to 9.4 where ALTER SYSTEM was added. Report by Feike
- Split out the description of page-level locks as a new subsection in
the documentation. Michael Banck
- Refactor pg_receivexlog main loop code, for readability. Previously
the code for receiving the data and for polling the socket
was included in the pg_receivexlog main loop. This commit splits
it out into separate functions. This is useful for improving the
readability of the main loop code and making future
pg_receivexlog-related patches simpler.
- Fix double-free bug of WAL streaming buffer in pg_receivexlog. This
bug was introduced while refactoring in commit 74cbe96.
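The autocommit-off rule described in the ALTER SYSTEM item above can be sketched as a small decision function. This is an illustration in Python, not psql's C code: `needs_implicit_begin` is a hypothetical helper, the keyword matching is deliberately naive, and the two keyword tuples are short illustrative lists rather than the full set of commands psql recognizes.

```python
# Commands that manage transactions themselves (illustrative list).
TRANSACTION_CONTROL = ("BEGIN", "START TRANSACTION", "COMMIT", "END",
                       "ROLLBACK", "ABORT")
# Commands that cannot run inside a transaction block (illustrative, not
# exhaustive); ALTER SYSTEM is the entry the commit above is about.
NOT_ALLOWED_IN_BLOCK = ("VACUUM", "CREATE DATABASE", "DROP DATABASE",
                        "ALTER SYSTEM")


def _starts_with(statement, keyword):
    return statement == keyword or statement.startswith(keyword + " ")


def needs_implicit_begin(command, in_transaction_block, autocommit):
    """Return True if an implicit BEGIN should precede the command."""
    if autocommit or in_transaction_block:
        return False
    # Normalize case and whitespace, and drop a trailing semicolon.
    statement = " ".join(command.upper().split()).rstrip("; ")
    for keyword in TRANSACTION_CONTROL + NOT_ALLOWED_IN_BLOCK:
        if _starts_with(statement, keyword):
            return False
    return True
```

With autocommit off and no open transaction block, `needs_implicit_begin("SELECT 1;", False, False)` is true, while an `ALTER SYSTEM SET ...` command is sent as-is, so it no longer fails for being inside a transaction block.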
Kevin Grittner pushed:
- Smooth reporting of commit/rollback statistics. If a connection
committed or rolled back any transactions within a
PGSTAT_STAT_INTERVAL pacing interval without accessing any tables,
the reporting of those statistics would be held up until the
connection closed or until it ended a PGSTAT_STAT_INTERVAL interval
in which it had accessed a table. This could result in under-
reporting of transactions for an extended period, followed by a
spike in reported transactions. While this is arguably a bug, the
impact is minimal, primarily affecting, and being affected by,
monitoring software. It might cause more confusion than benefit to
change the existing behavior in released stable branches, so apply
only to master and the 9.4 beta. Gurjeet Singh, with review and
editing by Kevin Grittner, incorporating suggested changes from
Abhijit Menon-Sen and Tom Lane.
- Remove dead typeStruct variable from plpy_spi.c. Left behind by
Peter Eisentraut pushed:
- Use a separate temporary directory for the Unix-domain socket.
Creating the Unix-domain socket in the build directory can run into
name-length limitations. Therefore, create the socket file in the
default temporary directory of the operating system. Keep the
temporary data directory etc. in the build tree.
- Support vpath builds in TAP tests
== Rejected Patches (for now) ==
No one was disappointed this week
== Pending Patches ==
Noah Misch, Kyotaro HORIGUCHI and Etsuro Fujita traded patches around
allowing foreign tables to be part of table inheritance hierarchies.
Andres Freund sent in another revision of a patch to add a
cluster_name GUC, which controls whether same is visible in ps output.
Abhijit Menon-Sen and Marti Raudsepp traded patches to add a --stats
option to pg_xlogdump.
Pavel Stehule sent in two more revisions of a patch to allow psql to
show only failed queries.
Michael Paquier sent in another revision of a patch to extend MSVC
scripts to support --with-extra-version.
Heikki Linnakangas sent in another revision of a patch to change the
WAL format and API.
Michael Paquier and Ronan Dunklau traded patches for IMPORT FOREIGN
Tomas Vondra sent in three more revisions of a patch to make hash
buckets grow appropriately.
Kyotaro HORIGUCHI sent in a patch to break socket-blocking on
Kevin Grittner sent in another revision of a patch to add a C
extension to test delta relations in AFTER triggers.
Robert Haas and Andres Freund traded patches to simulate memory
barriers on platforms where they are not available.
Gurjeet Singh sent in another revision of a patch to send transaction
commit/rollback stats to the stats collector unconditionally.
Kyotaro HORIGUCHI sent in a patch to correct the documentation of
ALTER USER SET local_preload_libraries.
Ronan Dunklau sent in another revision of a patch to improve the
functionality of arrays of composite types returned from PL/Python.
Dilip Kumar sent in three more revisions of a patch to allow vacuumdb
to use multiple cores in parallel.
Michael Paquier sent two flocks of patches to fix an issue in WAL
Fujii Masao sent in another revision of a patch to make
log_disconnections PGC_SUSET rather than PGC_SIGHUP.
Michael Banck sent in a patch to add an additional documentation
subsection for page-level locks in the explicit-locking section.
Ian Lawrence Barwick sent in another revision of a patch to add
a "RETURNING PRIMARY KEY" syntax extension to DML.
Fabien COELHO sent in two more revisions of a patch to add a Gaussian
distribution to pgbench.
Peter Geoghegan sent in another revision of a patch to do better at
HINTing an appropriate column within errorMissingColumn().
Tomas Vondra sent in another revision of a patch to tweak
Tatsuo Ishii sent in another revision of a patch to add a
Abhijit Menon-Sen sent in two more revisions of a patch to introduce
Rahila Syed sent in two more revisions of a patch to allow various
compression algorithms for full page writes.
Andrew (RhodiumToad) Gierth sent in two patch sets intended to be
infrastructure for GROUPING SETS.
Furuya Osamu sent in another revision of a patch to add a synchronous
mode to pg_receivexlog.
Thomas Munro sent in a PoC patch to enable DISTINCT with btree skip
scan (a.k.a. "loose index scan").
David Rowley sent in another revision of a patch to allow NOT IN to
use ANTI joins.
David Rowley sent in another revision of a patch to allow subquery
LEFT JOIN removal where that would produce correct results.
Andrew (RhodiumToad) Gierth sent in two revisions of a patch to fix a
performance regression related to ScalarArrayOpExpr.
Craig Ringer sent in a patch to improve bytea error messages.
Emre Hasegeli sent in another revision of a patch to add selectivity
estimation for inet operators.
SAWADA Masahiko sent in another revision of a patch to add line number
as a prompt option to psql.
Jeff Davis sent in another revision of a patch to make it possible for
the LAG and LEAD window functions to ignore nulls.
Sent via pgsql-announce mailing list (pgsql-announce <at> postgresql.org)