Brett Henderson | 31 Mar 14:27 2013

OSM Binary (ie. PBF) Changes

Hi All,

I've just changed how the PBF support is provided in Osmosis.  Up until now there was a pre-compiled jar called osmpbf.jar checked into the Osmosis source tree.  It was compiled from Scott Crosby's github project here:

The problem with this is that it prevents me from publishing Osmosis to Maven Central: Central requires all dependencies to be available there too, so people downloading Osmosis via Maven would be unable to use PBF without separately obtaining that library.

To get around this I've repackaged the OSM-binary project to build as part of the Osmosis source tree.  The resultant jar is called osmosis-osm-binary, and it lives in a package called org.openstreetmap.osmosis.osmbinary.

I've tried to do this in a way that allows me to keep up to date with the upstream project.  I've forked the original repository on GitHub, and created an osmosis branch:

When changes are made to upstream, I should be able to merge master across to my osmosis branch.  The resultant tree is then checked directly into the Osmosis source tree.  On my local machine I actually have both git repositories acting on the same source files so it isn't too painful.  I have the Osmosis source checked out normally (ie. a .git directory at the root), and the osmosis-osm-binary project is also a clone of my forked OSM-binary project (ie. I have a .git directory in there as well).  So far it seems to work well enough.

If the original OSM-binary project ever gets published to Maven Central directly then I can stop these shenanigans and depend on it directly.
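If that ever happens, depending on it would be a one-line change in the build.  The coordinates below are illustrative assumptions only, not real published artefacts:

```groovy
// Hypothetical coordinates for illustration - group, artifact and
// version are assumptions, not anything actually published to Central.
dependencies {
    compile 'com.github.scrosby:osm-binary:1.0'
}
```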

If anybody has any questions, issues, suggestions, etc let me know.


PS. I'm getting close to publishing to Maven Central now.  There are no major blockers left that I'm aware of.
osmosis-dev mailing list
Brett Henderson | 30 Mar 12:41 2013

Publishing Osmosis to Maven Central

Hi All,

This may be of interest to some of you.  I've just begun the process of publishing Osmosis artefacts to Maven Central.

My current snapshot build is available here:

For those who are not familiar with Maven Central practices, it is not possible to publish directly to Maven Central itself.  The simplest way is to publish via the OSS Sonatype repository, which is then synced to Maven Central.

A few changes have been made to the Osmosis build to allow this to work (I'm still in the process of pushing these changes).  The most noticeable change is that most projects have been renamed to have an "osmosis-" prefix.  If you have an existing Eclipse workspace, you'll need to re-run "gradle eclipse" and re-import your projects.

The main blocker to publishing release artefacts is that I have two dependencies on libraries not available in Maven Central.  These are the scrosby PBF lib, and the BZip2 library which I built manually but which is based on Apache source code.  Both should be fixable by building and publishing them along with the rest of Osmosis, but I need to find time to do so.


Daniel Kerkow | 21 Mar 13:34 2013

Osmosis 0.42 Installation on Ubuntu 12.04

I want to use the following python script that extracts the OSM boundaries and does some additional stuff with them. 

This script also calls Osmosis, but needs version 0.42, which the Ubuntu repositories don't provide.

How exactly do I install Osmosis so that it can be invoked system-wide (i.e. found by the script), not just run manually from the command line?
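For what it's worth, a manual system-wide install is usually just unpacking the distribution archive and putting the launcher on the PATH.  Archive name and install prefix below are assumptions - adjust to whatever the 0.42 release archive is actually called:

```shell
# Assumed archive name and install prefix; download the archive first.
sudo mkdir -p /opt/osmosis-0.42
sudo tar -xzf osmosis-0.42.tgz -C /opt/osmosis-0.42 --strip-components=1
# Symlink the launcher onto the PATH so any script can call "osmosis":
sudo ln -sf /opt/osmosis-0.42/bin/osmosis /usr/local/bin/osmosis
```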

Thanks in advance,

Nicolas Colomer | 12 Mar 10:28 2013

OSM entity processing order

Hi Osmosis community!

When I manipulate an OSM file (compressed or not) using Osmosis, can I assume that entities will always be processed in this order: 1. bound, 2. node, 3. way, 4. relation?

This seems logical, since the OSM XML file format guarantees that "blocks come in this order" (see the "File format" section of the OSM XML wiki page).

In addition, I came across a post where Brett said:

> This is due to the way Osmosis processing works because it finishes processing nodes before it sees the ways.
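The assumption is also cheap to verify defensively in a consumer.  A minimal sketch, deliberately independent of the Osmosis API (the names here are made up for illustration):

```java
import java.util.List;

public class OrderCheck {
    // The expected stream order: bound, then nodes, then ways, then relations.
    public enum EntityType { BOUND, NODE, WAY, RELATION }

    // Returns true if the sequence of entity types never goes backwards.
    public static boolean isOrdered(List<EntityType> stream) {
        EntityType prev = EntityType.BOUND;
        for (EntityType t : stream) {
            if (t.ordinal() < prev.ordinal()) {
                return false;   // e.g. a node arriving after a way
            }
            prev = t;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isOrdered(List.of(
                EntityType.BOUND, EntityType.NODE,
                EntityType.WAY, EntityType.RELATION)));   // true
        System.out.println(isOrdered(List.of(
                EntityType.WAY, EntityType.NODE)));       // false
    }
}
```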

I just want to make sure my understanding is correct :)

Thank you very much!

Best regards,
Brett Henderson | 17 Feb 04:05 2013

Osmosis 0.42 Released

Hi All,

I've just released Osmosis 0.42.  It was easier to create a new release than to continue responding to limitations in 0.41 :-)

From changes.txt:
  • Fix PostgreSQL timestamp bugs in apidb replication logic.
  • Fix replication file merging boundary corrections.  Occurs when catching up after outages.
  • Replication logic correctly honours the max timestamp parameter.
  • Prevent replication file downloader from reading beyond maximum available replication interval.
  • Prevent replication file downloader from stalling if interval is too long.
  • Improve error reporting when an unknown global option is specified.
  • Disable automatic state.txt creation for --read-replication-interval.
  • Add --tag-transform plugin and task.
  • Reduce number of file handles consumed by file-based sorting.
  • Make the default id tracker Dynamic for --used-node and --used-way.
  • Use Gradle for the automated build scripts instead of Ant/Ivy.
  • Fix PostgreSQL ident authentication.
  • Remove obsolete debian build scripts.
  • Eliminate use of deprecated Spring SimpleJdbcTemplate.
  • Improve handling of invalid geometries in --pgsql-xxx tasks.
  • Default keepInvalidWays option on --pgsql-xxx tasks to true.
  • Enable keepInvalidWays functionality for --pgsql-xxx replication.
  • Fix pgsnapshot COPY load script to use ST_ prefix for all PostGIS functions.

Let me know if you see any issues.


Ilya Zverev | 6 Feb 10:05 2013

32-bit limit in IdTrackers

Hi!  As some of you may have read, in three days node ids are expected to surpass 2147483647, and this will trigger the exception "Cannot represent " + value + " as an integer."  This check is used in every IdTracker implementation, so id trackers will become unusable.  This will affect tag and area filters; regional extracts that are made with osmosis will break.  There is a comment at the start of each IdTracker class: "The current implementation only supports 31 bit numbers, but will be enhanced if and when required."  I guess now is the time.  Can anybody fix that?  There must be a reason why this hasn't been done yet.
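The failure mode is plain integer narrowing.  A minimal sketch of the kind of check described above (the method name is made up; only the exception text comes from the report):

```java
public class IdNarrowing {
    /** Narrow a 64-bit OSM id to int, failing past 2^31 - 1 as described above. */
    public static int toIntId(long value) {
        if (value > Integer.MAX_VALUE || value < Integer.MIN_VALUE) {
            throw new IllegalArgumentException(
                    "Cannot represent " + value + " as an integer.");
        }
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(toIntId(2147483647L));  // fits: prints 2147483647
        try {
            toIntId(2147483648L);                  // first id past the limit
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```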
Oliver Schrenk | 4 Feb 15:42 2013

Eclipse Setup, Missing task types

Are there any more current notes on how to set up Eclipse for osmosis development than the notes in [1]?
I know that ant has been deprecated in favor of gradle, so I installed Eclipse Gradle Support via [2], ran

    $ git clone
    $ cd osmosis
    $ ./gradlew assemble

and proceeded to import osmosis' multi-module build using `File > Import > Gradle`.  Everything compiles fine.
But when I try to execute a command like

    osmosis --read-xml file="bremen.osm.bz2" --write-apidb-0.6 host="" database="api06_test" user="osm" password="osm" validateSchemaVersion=no

using a Run Configuration with `org.openstreetmap.osmosis.core.Osmosis` as the main class and

    --read-xml file="bremen.osm.bz2" --write-apidb-0.6 host="" database="api06_test" user="osm" password="osm" validateSchemaVersion=no

as program arguments, I get

    Feb 04, 2013 3:31:39 PM org.openstreetmap.osmosis.core.Osmosis run
    INFO: Osmosis Version 0.41-55-gb44b7d7-dirty
    Feb 04, 2013 3:31:39 PM org.openstreetmap.osmosis.core.Osmosis run
    INFO: Preparing pipeline.
    Feb 04, 2013 3:31:39 PM org.openstreetmap.osmosis.core.Osmosis main
    SEVERE: Execution aborted.
    org.openstreetmap.osmosis.core.OsmosisRuntimeException: Task type read-xml doesn't exist.
        at org.openstreetmap.osmosis.core.pipeline.common.TaskManagerFactoryRegister.getInstance(
        at org.openstreetmap.osmosis.core.pipeline.common.Pipeline.buildTasks(
        at org.openstreetmap.osmosis.core.pipeline.common.Pipeline.prepare(
        at org.openstreetmap.osmosis.core.Osmosis.main(

It doesn't seem to pick up the various tasks.
My end goal is to debug write-apidb-0.6: I'm trying to write data to an unsupported database, I run into problems with duplicate user entries, and I want to use Eclipse's debugger to step through the code.

Best regards

OSX 10.8.2
Java 1.7.0_11-b21
osmosis 0.41-55-gb44b7d7-dirty
Toby Murray | 31 Jan 02:47 2013

Duplicate ways in pgsnapshot database

Today my minutely replication started failing with a unique constraint violation error from postgres.  Upon further investigation I found that there were *already* two copies of a way in my database.  An incoming change was trying to modify the way, which caused postgres to notice the duplication and error out.  Basically a "hey wait, there are two of them.  Which one do you want me to modify?"  Here is the osmosis output:

    Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "pk_ways"
      Detail: Key (id)=(26926573) already exists.

It was erroring on this way:

So a few questions immediately come to mind.

1) How did a duplicate record get into the database?  There is definitely a primary key constraint on the id column.  In this particular case it looks like it happened during the initial planet import, which I did from the January 2nd pbf file.  The two rows are identical in every way, and the way was last touched (before today's edit) in 2009.  All constraints are disabled during the \copy operation, so I can see a duplicate way being able to get in - although this implies that there are either two copies of the way in the planet file or a bug in osmosis.  I would have thought the primary key constraint would have been checked when it was recreated after the \copy operation, though.  Apparently not.

2) How do I fix this?  I believe deleting one of the rows would fix it, but I can't actually delete only one since *every* column is the same.  It was suggested on #osm-dev that I create a copy of one row in a temp table, delete both, and then reinsert the copy.  This is probably what I will try.

3) Are there any others?  Turns out: yes, there are 4 duplicated ways in my database.  This may not come through with good formatting, but here they are:

        id    | version | user_id |       tstamp
     26245218 |      12 |  163673 | 2011-02-06 06:54:10
     26245218 |      13 |  290680 | 2013-01-28 02:37:56
     26709186 |       4 |   64721 | 2008-09-02 04:39:21
     26709186 |       4 |   64721 | 2008-09-02 04:39:21
     26709284 |       4 |   70621 | 2008-10-26 14:06:03
     26709284 |       5 |   64721 | 2013-01-28 02:38:30
     26926573 |       4 |  118011 | 2009-12-27 07:13:28
     26926573 |       4 |  118011 | 2009-12-27 07:13:28

A couple of interesting things here.

- Two of them have identical duplicates (26709186 and 26926573).  These can both be explained by an error in the planet file or import.
- The other two, however, are not identical, and both of them must have been created during diff application, because it happened 2 days ago, within 10 seconds of each other.  It is possible that there were duplicates of these ways as well, and for some reason they just didn't hit this error during diff application, so one of the records was successfully updated.

Soo... wtf?  Does anyone have ideas about how postgres' primary key check could be circumvented?  Is my theory about the \copy getting around it during import feasible?  But what about the ones created during diff processing?  Looking at my system monitoring I don't see anything unusual going on 2 days ago.  I've been having problems with X on this machine, but that won't affect postgres, and osmosis is running inside of screen.  Soo... yeah.  Anything? :)
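On question 2, a temp table isn't strictly needed: even fully identical rows can be told apart by PostgreSQL's system column ctid (the physical row address).  An untested sketch against the pgsnapshot schema:

```sql
-- Find duplicated way ids:
SELECT id, count(*) FROM ways GROUP BY id HAVING count(*) > 1;

-- Delete one of each pair of fully identical rows; ctid differs
-- even when every ordinary column matches.
DELETE FROM ways a
USING ways b
WHERE a.id = b.id
  AND a.version = b.version
  AND a.ctid > b.ctid;
```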
Frederik Ramm | 28 Jan 10:49 2013

Un-Redacting Stuff

With the license change we introduced the concept of "redacted" objects.  Since "redacting" an old version touches that version in the database, initially such redactions made Osmosis issue diffs that contained that old version; we then introduced a quick fix to stop that.

We're now also using "redaction" to suppress objects where a copyright violation has occurred - but mistakes are possible, so we need to have a way to un-redact things if necessary, i.e. remove the "redaction_id" from a historic version again.

Simply setting the column to NULL will, again, make Osmosis issue a diff that contains the old version; this is unwanted.

How could we proceed?

1. Introduce a special value "0" (not NULL) to denote an un-redacted object; leave Osmosis unchanged (so it treats NULL and 0 differently and will only issue .osc for objects with redaction_id=NULL), and modify other API code to treat 0 and NULL the same (so historic versions can be accessed through the API if redaction_id is NULL or 0).  Cheap, easy, but a bit ugly.

2. Introduce an additional column "suppress_diff" on the nodes/ways/relations tables; on un-redaction, set redaction_id=NULL and suppress_diff=TRUE; modify Osmosis by adding an "and not suppress_diff" to the SQL query.  Would increase database size by something like 4 GB for the extra column.

3. Introduce an additional table of "un-redacted objects", storing object type, version, and id; when an object is un-redacted, add it to that table and clear the object's redaction_id, then modify the Osmosis query to only output objects that are not found in that table.  Uses little space but makes diff creation slower.

There might be more...
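For option 3, the replication-side change might look roughly like this - the table and column names are invented for illustration, not the real apidb schema:

```sql
-- Hypothetical un-redaction bookkeeping table:
--   unredacted_objects(obj_type, obj_id, obj_version)
SELECT n.*
FROM nodes n
WHERE n.redaction_id IS NULL
  AND NOT EXISTS (
        SELECT 1
        FROM unredacted_objects u
        WHERE u.obj_type    = 'node'
          AND u.obj_id      = n.node_id
          AND u.obj_version = n.version
      );
```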
Frederik Ramm  ##  eMail frederik@...  ##  N49°00'09" E008°23'33"
Daniel Kaneider | 24 Jan 21:47 2013

pgsimple/pgsnapshot possible bug


I did an import of OSM data into a PostgreSQL 9.2 DB using osmosis 0.41.  The pgsnapshot_load script stopped because some functions could not be found (Envelope, Collect) - PostGIS 2.0 removed these un-prefixed aliases.  If I am not wrong, then

UPDATE ways SET bbox = (
    SELECT Envelope(Collect(geom))
    FROM nodes JOIN way_nodes ON way_nodes.node_id = nodes.id
    WHERE way_nodes.way_id = ways.id
);

should be changed to

UPDATE ways SET bbox = (
    SELECT ST_Envelope(ST_Collect(geom))
    FROM nodes JOIN way_nodes ON way_nodes.node_id = nodes.id
    WHERE way_nodes.way_id = ways.id
);

This should also apply to the pg_simple script.

Daniel Kaneider

Frederik Ramm | 23 Jan 23:02 2013

Question regarding the replication file structure

I'm toying with the idea of offering regionalised diffs - i.e. a series of daily diffs for every regional extract that the site has to offer.  To make it easy for consumers to keep their extracts up to date, I thought about making an Osmosis-style replication directory for each extract, e.g. something like … or so.

Just to be safe: What are the conventions that I will have to follow so that this works seamlessly with existing clients?  Simply have an xxx.osc.gz and matching xxx.state.txt in the leaf directory, count from 000 to 999, then wrap to the next directory, and have the most recent state.txt file at the root directory as well - anything else?
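For illustration, those conventions would produce a tree something like this (extract name and sequence numbers invented):

```
extract-name/
  state.txt                 <- copy of the most recent leaf state.txt
  000/
    000/
      000.osc.gz
      000.state.txt
      ...
      999.osc.gz
      999.state.txt
    001/
      000.osc.gz            <- wraps to the next directory after 999
      ...
```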
If the frequency wasn't exactly daily - if, say, because of some sort of glitch there was no extract for one day and therefore the diff is missing, or if there were two extracts in one day - would that matter?

Frederik Ramm  ##  eMail frederik@...  ##  N49°00'09" E008°23'33"