== PostgreSQL Weekly News - January 24 2016 ==
David Fetter <david <at> fetter.org>
2016-01-25 05:30:04 GMT
== PostgreSQL Weekly News - January 24 2016 ==
The 8th PostgreSQL Session will be held on April 6th, 2016, in Lyon,
France. The CfP is open until February 29, 2016 at call-for-paper
<AT> postgresql-sessions <DOT> org.
== PostgreSQL Product News ==
pgBadger 7.3, a parallel PostgreSQL log analyzer written in Perl, released:
PyGreSQL 4.2, a Python driver for PostgreSQL, released.
== PostgreSQL Jobs for January ==
== PostgreSQL Local ==
FOSDEM PGDay is a one day conference that will be held ahead of FOSDEM in
Brussels, Belgium, on Jan 29th, 2016. Registration is still open.
Prague PostgreSQL Developer Day 2016 (P2D2 2016) is a two-day conference
that will be held on February 17-18 2016 in Prague, Czech Republic.
Czech language web site below:
The annual Indian PGday will be held in Bengaluru, Karnataka, India on
February 26, 2016.
The first pan-Asian PostgreSQL conference will be held March 16-17,
2016 in Singapore.
Nordic PGDay 2016 is a one day one track conference which will be held in
Helsinki, Finland, on March 17, 2016. Registration is still open.
PGConf US 2016 will take place April 18-20, 2016 in NYC. The CfP is
open until January 31st, 2016, 11:59pm EST.
LinuxFest Northwest will take place April 23-24, 2016 at Bellingham
Technical College in Bellingham, Washington, USA. The CfP is now
FOSS4G NA, will be held May 2-5, 2016 in Raleigh, North Carolina.
The CfP is open.
PGCon 2016 will be held May 17-21, 2016 in Ottawa.
This year's Swiss PGDay will be held on June 24, 2016 at the
University of Applied Sciences in Rapperswil (Switzerland).
The CfP is open.
== PostgreSQL in the News ==
Planet PostgreSQL: http://planet.postgresql.org/
PostgreSQL Weekly News is brought to you this week by David Fetter
Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david <at> fetter.org, German language
to pwn <at> pgug.de, Italian language to pwn <at> itpug.org. Spanish language
to pwn <at> arpug.com.ar.
== Applied Patches ==
Tom Lane pushed:
- Re-pgindent a few files. In preparation for landing index AM
- Restructure index access method API to hide most of it at the C
level. This patch reduces pg_am to just two columns, a name and a
handler function. All the data formerly obtained from pg_am is now
provided in a C struct returned by the handler function. This is
similar to the designs we've adopted for FDWs and tablesample
methods. There are multiple advantages. For one, the index AM's
support functions are now simple C functions, making them faster to
call and much less error-prone, since the C compiler can now check
function signatures. For another, this will make it far more
practical to define index access methods in installable extensions.
A disadvantage is that SQL-level code can no longer see attributes
of index AMs; in particular, some of the crosschecks in the
opr_sanity regression test are no longer possible from SQL. We've
addressed that by adding a facility for the index AM to perform such
checks instead. (Much more could be done in that line, but for now
we're content if the amvalidate functions more or less replace what
opr_sanity used to do.) We might also want to expose some sort of
reporting functionality, but this patch doesn't do that. Alexander
Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized
on by me.
- Add explicit cast to amcostestimate call. My compiler doesn't
complain here, but David Rowley's does ...
- Fix assorted inconsistencies in GiST opclass support function
declarations. The conventions specified by the GiST SGML
documentation were widely ignored. For example, the strategy-number
argument for "consistent" and "distance" functions is specified to
be a smallint, but most of the built-in support functions declared
it as an integer, and for that matter the core code passed it using
Int32GetDatum not Int16GetDatum. None of that makes any real
difference at runtime, but it's quite confusing for newcomers to the
code, and it makes it very hard to write an amvalidate() function
that checks support function signatures. So let's try to instill
some consistency here. Another similar issue is that the "query"
argument is not of a single well-defined type, but could have
different types depending on the strategy (corresponding to search
operators with different righthand-side argument types). Some of
the functions threw up their hands and declared the query argument
as being of "internal" type, which surely isn't right ("any" would
have been more appropriate); but the majority position seemed to be
to declare it as being of the indexed data type, corresponding to a
search operator with both input types the same. So I've specified a
convention that that's what to do always. Also, the result of the
"union" support function actually must be of the index's storage
type, but the documentation suggested declaring it to return
"internal", and some of the functions followed that. Standardize on
telling the truth, instead. Similarly, standardize on declaring the
"same" function's inputs as being of the storage type, not
"internal". Also, somebody had forgotten to add the "recheck"
argument to both the documentation of the "distance" support
function and all of their SQL declarations, even though the C code
was happily using that argument. Clean that up too. Fix up some
other omissions in the docs too, such as documenting that union's
second input argument is vestigial. So far as the errors in core
function declarations go, we can just fix pg_proc.h and bump
catversion. Adjusting the erroneous declarations in contrib modules
is more debatable: in principle any change in those scripts should
involve an extension version bump, which is a pain. However, since
these changes are purely cosmetic and make no functional difference,
I think we can get away without doing that.
- Fix assorted inconsistencies in GIN opclass support function
declarations. GIN had some minor issues too, mostly using
"internal" where something else would be more appropriate. I went
with the same approach as in 9ff60273e35cad6e, namely preferring the
opclass' indexed datatype for arguments that receive an operator RHS
value, even if that's not necessarily what they really are. Again,
this is with an eye to having a uniform rule for ginvalidate() to
check support function signatures.
- Add defenses against putting expanded objects into Const nodes.
Putting a reference to an expanded-format value into a Const node
would be a bad idea for a couple of reasons. It'd be possible for
the supposedly immutable Const to change value, if something
modified the referenced variable ... in fact, if the Const's
reference were R/W, any function that has the Const as argument
might itself change it at runtime. Also, because datumIsEqual() is
pretty simplistic, the Const might fail to compare equal to other
Consts that it should compare equal to, notably including copies of
itself. This could lead to unexpected planner behavior, such as
"could not find pathkey item to sort" errors or inferior plans. I
have not been able to find any way to get an expanded value into a
Const within the existing core code; but Paul Ramsey was able to
trigger the problem by writing a datatype input function that
returns an expanded value. The best fix seems to be to establish a
rule that varlena values being placed into Const nodes should be
passed through pg_detoast_datum(). That will do nothing (and cost
little) in normal cases, but it will flatten expanded values and
thereby avoid the above problems. Also, it will convert
short-header or compressed values into canonical format, which will
avoid possible unexpected lack-of-equality issues for those cases
too. And it provides a last-ditch defense against putting a toasted
value into a Const, which we already knew was dangerous, cf commit
2b0c86b66563cf2f. (In the light of this discussion, I'm no longer
sure that that commit provided 100% protection against such cases,
but this fix should do it.) The test added in commit
65c3d05e18e7c530 to catch datatype input functions with unstable
results would fail for functions that returned expanded values; but
it seems a bit uncharitable to deem a result unstable just because
it's expressed in expanded form, so revise the coding so that we
check for bitwise equality only after applying pg_detoast_datum().
That's a sufficient condition anyway given the new rule about
detoasting when forming a Const. Back-patch to 9.5 where the
expanded-object facility was added. It's possible that this should
go back further; but in the absence of clear evidence that there's
any live bug in older branches, I'll refrain for now.
- Suppress compiler warning. Given the limited range of i, these
shifts should not cause any problem, but that apparently doesn't
stop some compilers from whining about them. David Rowley
- Improve index AMs' opclass validation procedures. The amvalidate
functions added in commit 65c5fcd353a859da were on the crude side.
Improve them in a few ways: * Perform signature checking for
operators and support functions. * Apply more thorough checks for
missing operators and functions, where possible. * Instead of
reporting problems as ERRORs, report most problems as INFO messages
and make the amvalidate function return FALSE. This allows more
than one problem to be discovered per run. * Report object names
rather than OIDs, and work a bit harder on making the messages
understandable. Also, remove a few more opr_sanity regression test
queries that are now superseded by the amvalidate checks.
- Make extract() do something more reasonable with infinite datetimes.
Historically, extract() just returned zero for any case involving an
infinite timestamp[tz] input; even cases in which the unit name was
invalid. This is not very sensible. Instead, return infinity or
-infinity as appropriate when the requested field is one that is
monotonically increasing (e.g, year, epoch), or NULL when it is not
(e.g., day, hour). Also, throw the expected errors for bad unit
names. BACKWARDS INCOMPATIBLE CHANGE Vitaly Burovoy, reviewed by
- Remove new coupling between NAMEDATALEN and MAX_LEVENSHTEIN_STRLEN.
Commit e529cd4ffa605c6f introduced an Assert requiring NAMEDATALEN
to be less than MAX_LEVENSHTEIN_STRLEN, which has been 255 for a
long time. Since up to that instant we had always allowed
NAMEDATALEN to be substantially more than that, this was
ill-advised. It's debatable whether we need MAX_LEVENSHTEIN_STRLEN
at all (versus putting a CHECK_FOR_INTERRUPTS into the loop), or
whether it has to be so tight; but this patch takes the narrower
approach of just not applying the MAX_LEVENSHTEIN_STRLEN limit to
calls from the parser. Trusting the parser for this seems
reasonable, first because the strings are limited to NAMEDATALEN
which is unlikely to be hugely more than 256, and second because the
maximum distance is tightly constrained by MAX_FUZZY_DISTANCE
(though we'd forgotten to make use of that limit in one place).
That means the cost is not really O(mn) but more like O(max(m,n)).
Relaxing the limit for user-supplied calls is left for future
research; given the lack of complaints to date, it doesn't seem very
high priority. In passing, fix confusion between lengths-in-bytes
and lengths-in-chars in comments and error messages. Per gripe from
Kevin Day; solution suggested by Robert Haas. Back-patch to 9.5
where the unwanted restriction was introduced.
- Improve levenshtein() docs. Fix chars-vs-bytes confusion here too.
Improve poor grammar and markup.
- Improve cross-platform consistency of Inf/NaN handling in trig
functions. Ensure that the trig functions return NaN for NaN input
regardless of what the underlying C library functions might do.
Also ensure that an error is thrown for Inf (or otherwise
out-of-range) input, except for atan/atan2 which should accept it.
All these behaviors should now conform to the POSIX spec;
previously, all our popular platforms deviated from that in one case
or another. The main remaining platform dependency here is whether
the C library might choose to throw a domain error for sin/cos/tan
inputs that are large but less than infinity. (Doing so is not
unreasonable, since once a single unit-in-the-last-place exceeds PI,
there can be no significance at all in the result; however there
doesn't seem to be any suggestion in POSIX that such an error is
allowed.) We will report such errors if they are reported via
"errno", but not if they are reported via "fetestexcept" which is
the other mechanism sanctioned by POSIX. Some preliminary
experiments with fetestexcept indicated that it might also report
errors we could do without, such as complaining about underflow at
an unreasonably large threshold. So let's skip that complexity for
now. Dean Rasheed, reviewed by Michael Paquier
- Add trigonometric functions that work in degrees. The
implementations go to some lengths to deliver exact results for
values where an exact result can be expected, such as sind(30) = 0.5
exactly. Dean Rasheed, reviewed by Michael Paquier
- Adjust degree-based trig functions for more portability. The
buildfarm isn't very happy with the results of commit
e1bd684a34c11139. To try to get the expected exact results
everywhere: * Replace M_PI / 180 subexpressions with a precomputed
constant, so that the compiler can't decide to rearrange that
division with an adjacent operation. Hopefully this will fix
failures to get exactly 0.5 from sind(30) and cosd(60). * Add
scaling to ensure that tand(45) and cotd(45) give exactly 1; there
was nothing particularly guaranteeing that before. * Replace minus
zero by zero when tand() or cotd() would output that; many machines
did so for tand(180) and cotd(270), but not all. We could
alternatively deem both results valid, but that doesn't seem likely
to be what users will want.
- Further adjust degree-based trig functions for more portability.
The last round didn't do it. Per Noah Misch, the problem on at
least some machines is that the compiler pre-evaluates trig
functions having constant arguments using code slightly different
from what will be used at runtime. Therefore, we must prevent the
compiler from seeing constant arguments to any of the libm trig
functions used in this code. The method used here might still fail
if init_degree_constants() gets inlined into the call sites. That
probably won't happen given the large number of call sites; but if
it does, we could probably fix it by making init_degree_constants()
non-static. I'll avoid that till proven necessary, though.
- Still further adjust degree-based trig functions for more
portability. Indeed, the non-static declaration foreseen in my
previous commit message is necessary. Per Noah Misch.
- Yet further adjust degree-based trig functions for more portability.
Buildfarm member cockatiel is still saying that cosd(60) isn't 0.5.
What seems likely is that the subexpression (1.0 - cos(x)) isn't
being rounded to double width before more arithmetic is done on it,
so force that by storing it into a variable.
Tatsuo Ishii pushed:
- Fix typo. Reported by KOIZUMI Satoru.
Andrew Dunstan pushed:
- Remove Cygwin-specific code from pg_ctl This code has been there for
a long time, but it's never really been needed. Cygwin has its own
utility for registering, unregistering, stopping and starting
Windows services, and that's what's used in the Cygwin postgres
packages. So now pg_ctl for Cygwin looks like it is for any Unix
platform. Michael Paquier and me
Álvaro Herrera pushed:
- Add two HyperLogLog functions. New functions initHyperLogLogError()
and freeHyperLogLog() simplify using this module from elsewhere.
Author: Tomáš Vondra. Review: Peter Geoghegan
- PostgresNode: Add names to nodes. This makes the log files easier
to follow when investigating a test failure. Author: Michael
Paquier. Review: Noah Misch
- pg_dump: Fix quoting of domain constraint names. The original code
was adding double quotes to an already-quoted identifier, leading to
nonsensical results. Remove the quoting call. I introduced the
broken code in 7eca575d1c of 9.5 era, so backpatch to 9.5. Report
and patch by Elvis Pranskevichus Reviewed by Michael Paquier
Bruce Momjian pushed:
- Properly install dynloader.h on MSVC builds. This will enable
PL/Java to be cleanly compiled, as dynloader.h is a requirement.
Report by Chapman Flack Patch by Michael Paquier Backpatch through
Robert Haas pushed:
- Support multi-stage aggregation. Aggregate nodes now have two new
modes: a "partial" mode where they output the unfinalized transition
state, and a "finalize" mode where they accept unfinalized
transition states rather than individual values as input. These new
modes are not used anywhere yet, but they will be necessary for
parallel aggregation. The infrastructure also figures to be useful
for cases where we want to aggregate local data and remote data via
the FDW interface, and want to bring back partial aggregates from
the remote side that can then be combined with locally generated
partial aggregates to produce the final value. It may also be
useful even when neither FDWs nor parallelism are in play, as
explained in the comments in nodeAgg.c. David Rowley and Simon
Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi,
- Support parallel joins, and make related improvements. The core
innovation of this patch is the introduction of the concept of a
partial path; that is, a path which if executed in parallel will
generate a subset of the output rows in each process. Gathering a
partial path produces an ordinary (complete) path. This allows us
to generate paths for parallel joins by joining a partial path for
one side (which at the baserel level is currently always a Partial
Seq Scan) to an ordinary path on the other side. This is subject to
various restrictions at present, especially that this strategy seems
unlikely to be sensible for merge joins, so only nested loops and
hash joins paths are generated. This also allows an Append node to
be pushed below a Gather node in the case of a partitioned table.
Testing revealed that early versions of this patch made poor
decisions in some cases, which turned out to be caused by the fact
that the original cost model for Parallel Seq Scan wasn't very good.
So this patch tries to make some modest improvements in that area.
There is much more to be done in the area of generating good
parallel plans in all cases, but this seems like a useful step
forward. Patch by me, reviewed by Dilip Kumar and Amit Kapila.
Simon Riggs pushed:
- Refactor to create generic WAL page read callback. Previously we
didn’t have a generic WAL page read callback function, surprisingly.
Logical decoding has logical_read_local_xlog_page(), which was
actually generic, so move that to xlogfunc.c and rename to
read_local_xlog_page(). Maintain logical_read_local_xlog_page() so
existing callers still work. As requested by Michael Paquier,
Alvaro Herrera and Andres Freund
- Speedup 2PC by skipping two phase state files in normal path. 2PC
state info is written only to WAL at PREPARE, then read back from
WAL at COMMIT PREPARED/ABORT PREPARED. Prepared transactions that
live past one bufmgr checkpoint cycle will be written to disk in the
same form as previously. Crash recovery path is not altered.
Measured performance gains of 50-100% for short 2PC transactions by
completely avoiding writing files and fsyncing. Other optimizations
still available, further patches in related areas expected. Stas
Kelvich and heavily edited by Simon Riggs Based upon earlier ideas
and patches by Michael Paquier and Heikki Linnakangas, a concrete
example of how Postgres-XC has fed back ideas into PostgreSQL.
Reviewed by Michael Paquier, Jeff Janes and Andres Freund
Performance testing by Jesper Pedersen
- Refactor headers to split out standby defs Jeff Janes
- Correct comment in GetConflictingVirtualXIDs() We use Share lock
because it is safe to do so.
Peter Eisentraut pushed:
- psql: Add tab completion for COPY with query. From: Andreas
Karlsson <andreas <at> proxel.se>
- psql: Improve completion of FDW DDL commands. Add ALTER FOREIGN
DATA WRAPPER -> RENAME TO, ALTER SERVER -> RENAME TO, ALTER SERVER
... VERSION ... -> OPTIONS, CREATE FOREIGN DATA WRAPPER -> OPTIONS ,
CREATE SERVER -> OPTIONS, CREATE|ALTER USER MAPPING -> OPTIONS From:
Andreas Karlsson <andreas <at> proxel.se>
Fujii Masao pushed:
- Remove unused argument from ginInsertCleanup() It's an oversight in
== Rejected Patches (for now) ==
No one was disappointed this week
== Pending Patches ==
Alexander Shulgin sent in a patch to fix some IDENTIFICATION
Dmitry Dolgov sent in another revision of a patch to add array-style
subscripting to JSONB.
Tomas Vondra sent in another revision of a patch to add multivariate
Bruce Momjian and Joe Conway traded patches to expose pg_controldata
and pg_config as functions.
Artur Zakirov sent in two more revisions of a patch to implement fuzzy
substring searching with the pg_trgm extension.
Ashutosh Bapat sent in two more revisions of a patch to add
postgres_fdw join pushdown.
Vitaly Burovoy and Pavel Stěhule traded patches to add a custom
function for converting human readable sizes to bytes.
Dilip Kumar sent in another revision of a patch to move PinBuffer and
UnpinBuffer to atomics.
Thomas Munro sent in another revision of a patch to add causal reads.
SAWADA Masahiko sent in another revision of a patch to allow multiple
synchronous standby servers.
Alexander Shulgin sent in another revision of a patch to add an
extension called pg_logical_slot_stream_relation.
Tatsuo Ishii sent in a patch to fix a too-enthusiastic quoting of
identifiers with the high bit set.
Anastasia Lubennikova sent in two more revisions of a patch to
implement covering unique indexes.
Robert Haas and Etsuro Fujita traded patches to optimize write
operations on the PostgreSQL FDW.
Haribabu Kommi sent in two more revisions of a patch to do aggregation
David Rowley sent in another revision of a patch to serialize internal
Haribabu Kommi sent in two more revisions of a patch to add combine
functions for staged aggregates.
Kyotaro HORIGUCHI sent in another revision of a patch to allow
async-capable nodes to register callbacks to run the node before
ExecProcNode() and provide an example using same.
Etsuro Fujita sent in another revision of a patch to optimize
Daniel Verité sent in another revision of a patch to add a
\crosstabview to psql.
Tomas Vondra and David Rowley traded patches to optimize outer joins
where the outer side is unique.
Petr Jelínek sent in another revision of a patch to enable generic WAL
David Rowley sent in a patch to fix an issue where combining
aggretates didn't work with pg_dump.
Victor Wagner sent in another revision of a patch to implement
failover on the libpq connect level.
Aleksander Alekseev sent in another revision of a patch to optimize
Tomas Vondra sent in another revision of a patch to ensure that more
predictable column statistics are being gathered and used.
Fabien COELHO sent in another revision of a patch to add better
logging, etc. to pgbench.
David Rowley sent in another revision of a patch to remove
functionally dependent GROUP BY columns.
Alexander Korotkov sent in another revision of a patch to add partial
Pavel Stěhule sent in another revision of a patch to add a
Artur Zakirov sent in another revision of a patch to copy regexp_t.
Sent via pgsql-announce mailing list (pgsql-announce <at> postgresql.org)
To make changes to your subscription: