Frederik Ramm | 23 Jan 23:02 2013

Question regarding the replication file structure


    I'm toying with the idea of offering regionalised diffs - i.e. a 
series of daily diffs for every regional extract on offer. To make it 
easy for consumers to keep their extracts up to date, I thought about 
making an Osmosis-style replication directory for each extract, e.g. 
something like 

or so. Just to be safe: what are the conventions I have to follow so 
that this works seamlessly with existing clients? Simply have an 
xxx.osc.gz and a matching xxx.state.txt in each leaf directory, count 
from 000 to 999 and then wrap to the next directory, and have the most 
recent state.txt file at the root directory as well - anything else?

If the frequency wasn't exactly daily - if, say, because of some sort of 
glitch there was no extract for one day and therefore the diff is missing, 
or if there were two extracts in one day - would that matter?
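The layout described above can be sketched on disk as follows. This is only an illustration: the `replication/day` path and sequence number 1042 are made-up examples, and the state file keys follow the Java-properties convention (escaped colons) that Osmosis replication state files use. The nine-digit sequence number is split into three groups of three digits, so sequence 1042 lives at 000/001/042.

```shell
# Sketch of an Osmosis-style replication directory (paths are hypothetical).
mkdir -p replication/day/000/001

# Leaf state file: Java-properties style, colons escaped in the timestamp.
printf 'timestamp=2013-01-23T00\\:00\\:00Z\nsequenceNumber=1042\n' \
    > replication/day/000/001/042.state.txt

# The matching diff file (empty placeholder here).
: > replication/day/000/001/042.osc.gz

# The root state.txt mirrors the most recent leaf state file.
cp replication/day/000/001/042.state.txt replication/day/state.txt
cat replication/day/state.txt
```

Clients then read the root state.txt, compare its sequenceNumber to their own, and fetch the intervening .osc.gz files by deriving each path from the sequence number.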



Frederik Ramm  ##  eMail frederik@...  ##  N49°00'09" E008°23'33"
Paul Norman | 22 Jan 11:19 2013

Non-standard pgsnapshot indexes

I've talked in other places about the non-standard indexes that I have on my
pgsnapshot database, but I don't believe I've ever produced a full listing.
I believe the following are all the non-standard indexes I have, with the
size and any applicable comments in brackets.

On nodes:

btree (changeset_id) [37GB, DWG work tends to involve a lot of changeset queries]

gist (geom, tags) [153GB]

gin (tags) [24GB, xapi]

btree (array_length(akeys(tags), 1)) WHERE array_length(akeys(tags), 1) > 10
[92MB, for finding weirdly tagged stuff]

On ways: 

btree (changeset_id) [5.9GB]

btree ((tags -> 'name'::text) text_pattern_ops) WHERE tags ? 'name'::text
[1.3GB, for running tags -> 'name' LIKE queries as well as potentially
quicker name queries]

btree ((tags -> 'name_1'::text) text_pattern_ops) WHERE tags ?
'name_1'::text [49MB]

btree ((tags -> 'name_2'::text) text_pattern_ops) WHERE tags ?
'name_2'::text
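For reference, one of the partial name indexes above would be created along these lines. This is a sketch only: the index name `idx_ways_name` is hypothetical, the SQL assumes the standard pgsnapshot `ways` table with an hstore `tags` column, and the statement is written to a file here rather than executed against a live database.

```shell
# Write out the DDL for a partial, expression-based name index (not executed).
cat > name_index.sql <<'SQL'
-- text_pattern_ops makes the index usable for LIKE 'prefix%' queries;
-- the WHERE clause keeps the index small by only covering named ways.
CREATE INDEX CONCURRENTLY idx_ways_name
    ON ways USING btree ((tags -> 'name') text_pattern_ops)
    WHERE tags ? 'name';
SQL
cat name_index.sql
# To apply against a pgsnapshot database: psql -d pgsnapshot -f name_index.sql
```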

Paul Norman | 22 Jan 10:53 2013

pgsnapshot composite index results

I frequently use my pgsnapshot database for unusual purposes and end up
running non-standard queries.

The standard indexes for pgsnapshot nodes include a GiST index on geom.
Another common index suggested by the jxapi installation instructions[1] is
a GIN index on tags.

These indexes work well when you have a query that is highly selective
spatially or against the tags, but they are frequently not ideal for queries
combining a medium-selectivity spatial condition with a medium-selectivity
tag condition.
While working on addressmerge[2] I encountered a situation where the query
SELECT * FROM local_all; was quicker than SELECT * FROM local_all WHERE tags
? 'addr:housenumber'; local_all was a view of the nodes, ways and
multipolygons[3] in the local area. The speed difference was caused by a
non-optimal query plan of a query of the form SELECT * FROM nodes WHERE
st_intersects (geom,'my_geom'::geometry) AND tags ? 'addr:housenumber';
where my_geom was the EWKT for a polygon covering the area of interest.

The query plan for the first query involved an index scan of the geom GiST
index. The second involved a bitmap AND of the geom GiST and tags GIN
indexes. Unfortunately, due to the limitations of hstore statistics this was
likely not the optimal plan. An exploration of options in #postgresql led
to the discussion of a composite GiST index on (geom, tags) as an
alternative indexing strategy, which is what this message is about (after
this rather lengthy preamble.)

A composite index would be created with a statement like CREATE INDEX [
CONCURRENTLY ] ON nodes USING gist (geom, tags); This index can benefit

Brian Cavagnolo | 17 Jan 22:35 2013

cleaning up osmosis temp files?

Is there a recommended way to clean up the temp files left behind by
osmosis?  I've been just poking around the /tmp directory blowing away
the copy* and nodeLocation* files.
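A safer alternative to deleting these by hand is an age-based cleanup, so files that a still-running osmosis instance is using are left alone. The sketch below uses a throwaway sandbox directory for demonstration; the `copy*` and `nodeLocation*` patterns are the ones mentioned above, and the one-day age cutoff is an assumption you should adjust to your longest-running job.

```shell
# Demonstration sandbox (in real use, point find at /tmp or your java.io.tmpdir).
demo=$(mktemp -d)
touch "$demo/copy123.tmp" "$demo/nodeLocation456.tmp" "$demo/keepme.txt"

# Backdate the osmosis-style files so they look like leftovers from an old run.
touch -d '2 days ago' "$demo/copy123.tmp" "$demo/nodeLocation456.tmp"

# Delete only matching files older than one day; recent files survive.
find "$demo" -maxdepth 1 -type f \
    \( -name 'copy*' -o -name 'nodeLocation*' \) \
    -mtime +1 -delete

ls "$demo"
```

Running osmosis with `-Djava.io.tmpdir=` pointed at a dedicated directory also makes this kind of cleanup less risky, since nothing else writes there.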

Nicolas Colomer | 11 Jan 23:33 2013

Osmosis plugin development and integration tests

Hi Osmosis community!

I have a question about plugin development: I am currently working on a
plugin and reach the point where I need to make integration test on small
OSM data sets. Such integration tests will allow me to check the plugin
behavior with more complex data than mocked entities.

By integration test, I mean starting Osmosis programmatically in a JUnit
test case, somehow declaring my plugin, and finally crunching and processing
OSM files. I saw in the Osmosis source code that such tests exist: the best
example I found is

So I tried to reproduce these tests, but I am facing a class loading issue
(always the same exception, even for simple commands that involve only XML
reading / writing):

java.lang.NoClassDefFoundError: org/java/plugin/PluginLifecycleException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.lang.reflect.Method.invoke(

l m | 9 Jan 10:29 2013

Using osmosis as a library


I'm trying to use osmosis (xml-reader) as a library in my android application.

I also need to use an external plugin together with osmosis.

I tried to add both the plugin and osmosis-xml as libraries but it can't be compiled because of duplicate osmosis-plugins.conf files

What is the proper way of doing this, if it is even possible?
osmosis-dev mailing list
Toby Murray | 6 Jan 00:57 2013

Performance of pgsnapshot replication

While waiting for a recent planet import to catch up using minutely
diffs I started wondering what the slow parts of minutely processing
were. So I took a look at my postgres log which is set to record slow
queries. Turns out, the most frequent slow query during diff
processing is the one that updates a way's linestring after a node is
modified. [1] Sometimes it takes tens of seconds. I assume this is only
on very large ways and/or nodes that belong to many ways. On some
random way with 100 nodes the query took ~300ms for me. A node that
was a member of two ways took ~600ms.

The one that does the same when a way is modified is also in the slow
query log quite frequently. [2]

But this got me to thinking... this query is executed at the node
level. If I am reading this right, this will lead to a LOT of
unnecessary linestring updates. For example if I am working on TIGER
fixup and move 50 nodes in a way and then add another 10, the way is
going to have its linestring updated 51 times while processing a
single diff. One time each when the 50 existing nodes are moved and
then once more when the 10 new nodes are added to the way. Maybe not
in that order.

This query is also executed on nodes that aren't members of any way.
Executing the query for this type of node took me about 150ms.
Unfortunately, diffs don't tell us which ways a node is a member of, so
executing the query on unconnected nodes seems unavoidable.

The one place where I think an improvement might be possible is the
"linestring updated 51 times in one diff" problem. My initial thought
would be to make the "action" table required for diff consumption.
Then instead of updating the linestring in Node/WayDao, wait until the
diff processing is complete and issue a single query to update all
ways affected in the diff based on what is in the action table. Maybe
something in ChangeWriter.complete() [3] right before it truncates the
action table. This is somewhat similar to what osm2pgsql does with its
"pending" tables although osm2pgsql actually constructs linestrings in
code using cached node locations so it is a little bit different.
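The batched approach described above could look roughly like the following. This is a sketch, not the actual pgsnapshot code: the inner subquery mirrors the per-way linestring rebuild, while the `actions` table column names (`data_type`, `id`) are assumptions about the schema. The SQL is written to a file here rather than run against a database.

```shell
# Write out a sketch of a once-per-diff linestring rebuild (not executed).
cat > batch_linestrings.sql <<'SQL'
-- Rebuild each affected way's linestring exactly once per diff,
-- driven by the ways recorded in the action table.
-- NOTE: the actions(data_type, id) columns are assumed, not verified.
UPDATE ways w
SET linestring = (
    SELECT ST_MakeLine(c.geom)
    FROM (
        SELECT n.geom
        FROM way_nodes wn
        JOIN nodes n ON n.id = wn.node_id
        WHERE wn.way_id = w.id
        ORDER BY wn.sequence_id
    ) c
)
WHERE w.id IN (SELECT id FROM actions WHERE data_type = 'W');
SQL
cat batch_linestrings.sql
```

Compared with the current per-node updates, a way touched 51 times in one diff would be rebuilt once, at the cost of keeping the action table populated until ChangeWriter.complete() runs.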

Here is an EXPLAIN ANALYZE for the node query (you might need to click on
the "raw" option to avoid wrapping):
It seems like there is a potential benefit from grouping updates so that
some of those index scans only happen once per diff instead of once per
node.
The first question that comes to mind is what is the overhead of using
the action table and would it eclipse any gains achieved by this?

Has anyone else given this subject any thought? Is it even worth
pursuing further?


Jeff Meyer | 3 Jan 21:02 2013

Osmosis error - node not present in table "current_nodes" - no bbox

After executing:
$ osmosis --rb file="/home/historic/data/planet-natural.pbf" --wd user="openstreetmap" password="openstreetmap" database="osm" validateSchemaVersion=no

I'm running into the following error:
(full excerpt below)
Caused by: org.postgresql.util.PSQLException: ERROR: insert or update on table "current_way_nodes" violates foreign key constraint "current_way_nodes_node_id_fkey"
  Detail: Key (node_id)=(25918312) is not present in table "current_nodes".

Most of what I can find about this on the intrawebs indicates that this is typically a bbox-related issue, but I haven't used any bboxes to create my planet-natural file. (see steps below)

Any guesses / suggestions / obvious things I'm missing?

Here's the sequence I've used to generate the planet-natural.pbf file - my goal is to cut down the size of the file and retain only big waterways and natural features - just ways and relations, no nodes. I believe osmfilter should keep nodes required by ways and relations, but could be wrong there.

$ osmconvert planet-121226.osm.pbf -o=planet.o5m
$ ./osmfilter planet.o5m --drop-tags="created_by= nhd:*= yf:*= canvec:*= gnis:*= NHD:*= KSJ2:*= massgis:*=" --drop-author --drop="natural=wood waterway=drain or waterway=ditch or waterway=stream or leisure=park" -o=planet-natural-temp1.o5m
$ ./osmfilter planet-natural-temp1.o5m --keep= --keep-ways="natural= or waterway=" --keep-relations="natural= or waterway=" -o=planet-natural-temp2.o5m
$ ./osmconvert planet-natural-temp2.o5m -o=planet-natural.pbf

Thanks, Jeff

Osmosis error:

SEVERE: Thread for task 1-rb failed
org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to load current way nodes.
at org.openstreetmap.osmosis.apidb.v0_6.ApidbWriter.populateCurrentWays(
at org.openstreetmap.osmosis.apidb.v0_6.ApidbWriter.populateCurrentTables(
at org.openstreetmap.osmosis.apidb.v0_6.ApidbWriter.complete(
at crosby.binary.osmosis.OsmosisBinaryParser.complete(
at crosby.binary.file.BlockInputStream.process(
Caused by: org.postgresql.util.PSQLException: ERROR: insert or update on table "current_way_nodes" violates foreign key constraint "current_way_nodes_node_id_fkey"
  Detail: Key (node_id)=(25918312) is not present in table "current_nodes".
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(
at org.postgresql.core.v3.QueryExecutorImpl.processResults(
at org.postgresql.core.v3.QueryExecutorImpl.execute(
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(
at org.openstreetmap.osmosis.apidb.v0_6.ApidbWriter.populateCurrentWays(
... 6 more
Jan 3, 2013 7:41:55 PM org.openstreetmap.osmosis.core.Osmosis main
SEVERE: Execution aborted.
org.openstreetmap.osmosis.core.OsmosisRuntimeException: One or more tasks failed.
at org.openstreetmap.osmosis.core.pipeline.common.Pipeline.waitForCompletion(
at org.openstreetmap.osmosis.core.Osmosis.main(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(
at org.codehaus.plexus.classworlds.launcher.Launcher.main(
at org.codehaus.classworlds.Launcher.main(

Jeff Meyer
Global World History Atlas

Jeff Meyer | 31 Dec 18:25 2012

production site disk use question

Are there any rules of thumb about Postgres disk requirements when importing osm data?

For example, I just tried importing a 1.7GB planet-reduced.pbf into my rails port osm db and it failed after ~30 hrs because I ran out of disk space after it had eaten up 50GB of disk. Bad planning on my part, but how should I budget for this?

e.g. a 30GB .osm file will probably take 2x that when the import is finished?
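There is no measured answer in this thread, so the following is only a budgeting sketch: the 3x multiplier is an assumed safety factor (indexes and WAL tend to push usage above the raw data size), not a benchmark. The point is to make the headroom assumption explicit before a 30-hour import, instead of discovering it when the disk fills.

```shell
# Rough disk budget for a rails-port / apidb import (illustrative numbers).
osm_gb=30        # uncompressed .osm size
multiplier=3     # assumed safety factor, NOT a measured figure
needed=$((osm_gb * multiplier))
echo "budget at least ${needed}GB free for a ${osm_gb}GB .osm import"
```

Checking free space with `df -h` against this estimate before starting the import would catch the failure mode described above up front.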


Jeff Meyer
Global World History Atlas

Jeff Meyer | 29 Dec 00:12 2012

Osmosis use q: points showing up in results after --used-node

Hi - 

I'm seeing individual nodes where I don't think I should be seeing any in this file:

It was generated by:

$ osmosis --rb planet-120926.osm.pbf --bb left=-122.53 bottom=47.34 right=-122.06 top=47.79 top=49.5138 --write-xml seattle.osm

$ osmosis --rx seattle.osm --tf accept-ways natural=* --used-node --wx seattle-natural.osm

There seem to be plenty of non-natural single nodes - bus stations, traffic lights, turn restrictions, etc.

Am I doing something wrong?


Jeff Meyer
Global World History Atlas

Eric Fernandez | 13 Dec 13:18 2012

deadlock when extracting bounds with 0.41 but not 0.40.1


This is a follow-up to the thread here:

I am experiencing the same hanging issue with osmosis 0.41 with the GB
map from , which I did not have with osmosis 0.40.1.

The command I use is the same one (as described in

./osmosis-0.41/bin/osmosis --read-pbf file=great_britain.osm.pbf
outPipe.0=1 --tee 2 inPipe.0=1 outPipe.0=2 outPipe.1=3 --buffer
inPipe.0=3 outPipe.0=4 --buffer inPipe.0=2 outPipe.0=5 --tag-filter
accept-relations boundary=administrative,postal_code inPipe.0=4
outPipe.0=6 --used-way inPipe.0=6 outPipe.0=7 --tag-filter
reject-relations inPipe.0=5 outPipe.0=8 --tag-filter accept-ways
boundary=administrative,postal_code inPipe.0=8 outPipe.0=9 --used-node
inPipe.0=9 outPipe.0=10 --used-node inPipe.0=7 outPipe.0=11 --merge
inPipe.0=10 inPipe.1=11 outPipe.0=12 --write-pbf
file=great_britain-boundaries.osm.pbf omitmetadata=true
compress=deflate inPipe.0=12

In his answer, Andrew Byrd explains that there may be a deadlock problem
with the inputs to the merge, but this command was working with
osmosis 0.40.1, which normally deals with this issue already. Is
there any reason the new 0.41 version runs into it?

Many thanks in advance,