[OSM-dev] Visualising change (.osc) files

Thu Nov 27 18:50:30 GMT 2008

Wow,
Thanks for all the info Brett!
I have subscribed to itoworld, and it's extremely useful. However, it would also help to see node changes and to be able to visualise a wider area.

When you say "if you can wait for API 0.6", what do you mean? I am assuming that Osmosis is able to deal with 0.6 format data now.

I had a go at populating a pgsql database with the new 0.6 schema, using version 0.29.4 of Osmosis. Unfortunately the latest version zip linked from the wiki page is corrupted for some reason. I get an error as follows:
SEVERE: Thread for task 1-read-xml-0.6 fail
java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Unknown Source)

..... etc down to the calling of the run method of 0.6 Xml Reader;
at com.bretth.osmosis.core.xml.v0_6.XmlReader.run(XmlReader.java:109)

my command line is as follows:
 java -classpath "C:\Documents and Settings\Stevo\My Documents\Maps\osmosis-0.29.4\osmosis.jar";"C:\Program Files\Java\jdk1.6.0_07\bin\postgis_1.4.0SVN.jar";"C:\Program Files\Java\jdk1.6.0_07\bin\postgresql-8.3-604.jdbc4.jar" -Xmx1048m com.bretth.osmosis.core.Osmosis --read-xml-0.6 "C:\Documents and Settings\Stevo\My Documents\Maps\uk-081119.osm.bz2" --write-pgsql-0.6 host=***** database=**** user=****

Not sure where that's coming from really.

Steve

________________________________
From: Brett Henderson <brett at bretth.com>
To: S Knox <roxyknox at yahoo.co.uk>
Cc: dev at openstreetmap.org
Sent: Tuesday, 25 November, 2008 21:50:48
Subject: Re: [OSM-dev] Visualising change (.osc) files

On Wed, Nov 26, 2008 at 6:26 AM, S Knox <roxyknox at yahoo.co.uk> wrote:

Hi List,

I am already on the talk list, but this is my first post here. Hope it makes some semblance of sense.

Having seen Osmdiff in action, I thought it would be useful to have a mechanism to see daily changes to the database over a wider area (e.g. a region or a country), and/or without quite so much downloading, processing and hard disk space usage* (not that it's not a great tool). This could use either the full daily changeset or the changeset filtered by a bounding box. Of course, there are currently no facilities to visualise this, as way nodes are not included in the diff file (understandably). So is there some way that modified ways could be cross-referenced with an existing Postgis database, so that all changes could then be uploaded to a separate Postgis database? I was thinking about the steps necessary to achieve this and came up with the following.

1) Any added nodes are added to the "Change" Postgis database - should be relatively easy as already have location in changeset
2) Any changed nodes are added to the "Change" Postgis database in a similar way
3) Any changed ways query the "Existing" Postgis database and get the list of nodes and their locations. The entire way is then added to the "Change" database.
4) Deleted nodes/ways - this could also query the database and get the old geographic information.
5) The output could then be visualised in Mapnik or similar.
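
As a rough illustration of steps 1 to 4, here is a minimal Python sketch. It is not how Osmdiff or Osmosis work. The dictionaries EXISTING_NODES and EXISTING_WAYS are hypothetical stand-ins for the "Existing" Postgis database, SAMPLE_OSC is an invented change file in the osmChange layout (create/modify/delete blocks), and build_change_records is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the "Existing" database (assumptions, not real data).
EXISTING_NODES = {1: (51.50, -0.12), 2: (51.51, -0.11), 3: (51.52, -0.10)}
EXISTING_WAYS = {11: [2, 3]}  # way id -> ordered node ids

# Invented sample change file for illustration only.
SAMPLE_OSC = """<osmChange version="0.6">
  <create>
    <node id="4" lat="51.53" lon="-0.09"/>
  </create>
  <modify>
    <node id="2" lat="51.515" lon="-0.115"/>
    <way id="10"><nd ref="1"/><nd ref="2"/><nd ref="4"/></way>
  </modify>
  <delete>
    <node id="3"/>
    <way id="11"/>
  </delete>
</osmChange>"""


def build_change_records(osc_text, existing_nodes, existing_ways):
    """Return (action, kind, id, geometry) tuples for changed nodes and ways."""
    root = ET.fromstring(osc_text)

    # Steps 1 and 2: positions carried by the change file override the old ones.
    positions = dict(existing_nodes)
    for block in root:
        if block.tag in ("create", "modify"):
            for node in block.findall("node"):
                positions[int(node.get("id"))] = (
                    float(node.get("lat")),
                    float(node.get("lon")),
                )

    records = []
    for block in root:
        for entity in block:
            entity_id = int(entity.get("id"))
            if entity.tag == "node":
                # Step 4: a deleted node takes its old position from the existing data.
                source = existing_nodes if block.tag == "delete" else positions
                geometry = source.get(entity_id)
            elif entity.tag == "way":
                if block.tag == "delete":
                    # Step 4: a deleted way is rebuilt entirely from the existing data.
                    refs = existing_ways.get(entity_id, [])
                    lookup = existing_nodes
                else:
                    # Step 3: resolve each node reference to a location.
                    refs = [int(nd.get("ref")) for nd in entity.findall("nd")]
                    lookup = positions
                geometry = [lookup[ref] for ref in refs if ref in lookup]
            else:
                continue  # relations are left out of this sketch
            records.append((block.tag, entity.tag, entity_id, geometry))
    return records


records = build_change_records(SAMPLE_OSC, EXISTING_NODES, EXISTING_WAYS)
for record in records:
    print(record)
```

The records list is what would be written to the "Change" database; node references that cannot be resolved are silently skipped here, which a real tool would probably want to report.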

Firstly these guys http://www.itoworld.com/static/osmmapper are already doing much of this.  You may want to check it out to see if it does what you want.  However if you do go down this path read on ...

There are two databases that already do some of this:
* The database produced by the osm2pgsql tool which supports current mapnik rendering.  It can operate in a slim mode which means it stores node and way information to allow changesets to be applied.  I suspect this would be fairly complicated to adapt to your needs but I'm not familiar with the tool.
* The 0.6 version of the osmosis pgsql schema.  Obviously not available yet but "real soon now" 0.6 will be deployed.  The pgsql schema is closely aligned with the main MySQL API schema, it contains entity tables (ie. node/way/relation) plus associated node_tags/way_tags, etc tables to store the related information.  If you can wait for 0.6 this might suit your needs.  More details below ...

The schema creation script is here:
http://svn.openstreetmap.org/applications/utils/osmosis/script/pgsql_simple_schema_0.6.sql

The 0.6 version introduces an action column against each entity.  For existing data this field will be set to "N" which means an action of None.  But when the entity is impacted by a changeset imported the field will become one of:
* "A" - Add
* "M" - Modify
* "D" - Delete
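
A tool reading the action column might map the codes like this. This is only a sketch in Python: the Action enum and group_changed are hypothetical names, and the sample rows assume a query along the lines of "SELECT id, action FROM nodes", whose exact column names should be checked against the schema script above.

```python
from collections import defaultdict
from enum import Enum


class Action(Enum):
    """The four action codes described above."""
    NONE = "N"
    ADD = "A"
    MODIFY = "M"
    DELETE = "D"


def group_changed(rows):
    """Group (entity_id, action_code) rows by action, skipping untouched entities."""
    grouped = defaultdict(list)
    for entity_id, code in rows:
        action = Action(code)  # raises ValueError on an unknown code
        if action is not Action.NONE:
            grouped[action].append(entity_id)
    return dict(grouped)


# Hypothetical rows, as an assumed "SELECT id, action FROM nodes" might return them.
sample_rows = [(1, "N"), (2, "A"), (3, "M"), (4, "D"), (5, "M")]
changed = group_changed(sample_rows)
print(changed)
```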

The database has an empty stored procedure called osmosisUpdate which is triggered whenever a changeset is applied.  This allows a tool using the pgsql schema to modify osmosisUpdate to perform some action based upon the changed data.  This might be just to copy the affected entity ids to a temporary table until a tool written in a higher level language can process them.

The sequence of events is this:
1. The pgsql database contains a snapshot of the planet at a point in time.  All entity actions are None.
2. Osmosis begins a database transaction.
3. Osmosis imports a changeset file and sets the action of each impacted entity to one of Add, Modify or Delete.
4. Osmosis calls the osmosisUpdate stored proc.
5. The osmosisUpdate stored proc performs some application specific functionality.  Note that it does nothing by default.
6. Osmosis sets the action of all entities to None.
7. Osmosis commits the transaction.
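
The sequence can be mimicked in a few lines of Python, purely as an illustration. This is not Osmosis code: it uses an in-memory SQLite database instead of pgsql, a hypothetical nodes table with only id and action columns, and a Python callback (on_update) standing in for the osmosisUpdate stored proc. Removing rows marked Delete before the reset is an assumption of this sketch, not something stated above.

```python
import sqlite3


def apply_changeset(conn, changes, on_update):
    """Mark actions, call the hook, reset actions and commit, all in one transaction."""
    cur = conn.cursor()
    try:
        # Steps 2 and 3: sqlite3 opens the transaction implicitly on the first write.
        for entity_id, action in changes:
            if action == "A":
                cur.execute(
                    "INSERT INTO nodes (id, action) VALUES (?, ?)", (entity_id, action)
                )
            else:
                cur.execute(
                    "UPDATE nodes SET action = ? WHERE id = ?", (action, entity_id)
                )
        # Steps 4 and 5: the hook sees the marked rows before they are reset.
        on_update(cur)
        # Assumption of this sketch: deleted entities are dropped before the reset.
        cur.execute("DELETE FROM nodes WHERE action = 'D'")
        # Step 6: every remaining entity goes back to None.
        cur.execute("UPDATE nodes SET action = 'N'")
        # Step 7.
        conn.commit()
    except Exception:
        conn.rollback()
        raise


# Step 1: a snapshot where all actions are None.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, action TEXT NOT NULL DEFAULT 'N')")
conn.executemany("INSERT INTO nodes (id) VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

captured = []


def record_changes(cur):
    """Hypothetical hook: copy the affected ids somewhere for later processing."""
    cur.execute("SELECT id, action FROM nodes WHERE action <> 'N' ORDER BY id")
    captured.extend(cur.fetchall())


apply_changeset(conn, [(4, "A"), (2, "M"), (3, "D")], record_changes)
remaining = conn.execute("SELECT id, action FROM nodes ORDER BY id").fetchall()
print(captured)
print(remaining)
```

Because the hook runs inside the same transaction, anything it writes is committed or rolled back together with the changeset.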

Okay, so for your purposes, perhaps the following could be done.  There may be other more effective ways of achieving the same thing.
You could create an osmosis pgsql database and use osmosis to keep it fed with minute/hourly/daily diffs.  Then modify the osmosisUpdate stored proc to write to your own tables that capture the information you wish to render.  You could then write a script to extract those tables into an osm format file containing a representation of impacted ways.  Use osm2pgsql to import this into a mapnik rendering database, customise the mapnik stylesheets to understand your special tags, and voila.
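
The extraction script could look something like the following Python sketch. Everything specific here is an assumption: the input layout (way id, action code, list of node id/lat/lon), the function name ways_to_osm, and the "change_action" tag key are all invented for illustration, and a file intended for a real osm2pgsql import may need further attributes on each element.

```python
import xml.etree.ElementTree as ET


def ways_to_osm(ways):
    """Build an OSM XML string from (way_id, action, [(node_id, lat, lon), ...])."""
    root = ET.Element("osm", version="0.6", generator="change-export-sketch")

    # Nodes come first in an OSM file; each node is written only once.
    seen = set()
    for _, _, nodes in ways:
        for node_id, lat, lon in nodes:
            if node_id not in seen:
                seen.add(node_id)
                ET.SubElement(root, "node", id=str(node_id), lat=str(lat), lon=str(lon))

    # Then the ways, each carrying the hypothetical special tag for the stylesheet.
    for way_id, action, nodes in ways:
        way = ET.SubElement(root, "way", id=str(way_id))
        for node_id, _, _ in nodes:
            ET.SubElement(way, "nd", ref=str(node_id))
        ET.SubElement(way, "tag", k="change_action", v=action)

    return ET.tostring(root, encoding="unicode")


# Invented sample of impacted ways, as the custom tables might provide them.
sample_ways = [
    (10, "M", [(1, 51.5, -0.12), (2, 51.515, -0.115)]),
    (11, "D", [(2, 51.515, -0.115), (3, 51.52, -0.1)]),
]
osm_xml = ways_to_osm(sample_ways)
print(osm_xml)
```

A mapnik style rule could then filter on the special tag to draw added, modified and deleted ways differently.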

One thing I still need to do is write some additional functionality into osmosis and the pgsql tasks to allow it to support a bounding box.  This would only write information into the pgsql database if the entities were within a specific bounding box.  The only way to do this at the moment is to import the complete changeset then write a custom query to remove the bits you don't want.  Changesets are relatively small though so if you import complete changesets your database won't grow too quickly and any cleanup can be performed manually on an ad hoc basis.

Is there something I haven't thought through, and what are the likely coding implications? I can tinker around with Perl, Python and Java, but when it comes to a large program like Osmosis I am a little lost to be honest.
Hopefully you won't need to customise osmosis.  But if you would like to create a new task let me know what the inputs and the outputs of the task are and I'll help you understand the app code layout.  Each task is fairly self contained and doesn't require knowledge of the entire application.  It sounds like most of your work will be generating data in a format suitable for mapnik rendering so a custom script may be more appropriate for that.  Osmosis will be good at identifying the data you need to map; your script could then turn it into a set of nodes and ways for rendering.

For info there was a thread some while back on how it was impossible to filter a changeset by bounding box, and what alterations might be necessary to Osmosis to allow this to happen:
http://www.nabble.com/Osmosis:-Bounding-polygon-does-not-support-change-data-as-input--td19546267.html

However, I think this is a distinct task.
Yeah, that discussion was around methods for extracting parts of a
changeset when you're not interested in the entire planet.  It should
be possible to work around that limitation as follows:
* Download a full planet.
* Extract the bbox you're interested in.
* Import the bbox into your database.
* From that point on import complete changesets which will mean you have data you don't need.
* Occasionally run a cleanup sql script to remove data outside the bbox if the database grows too much. 
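
The cleanup step might be sketched as below. This is an illustration in Python against an in-memory SQLite database, not the real pgsql schema: the nodes, ways and way_nodes tables with plain lat/lon columns are simplified assumptions, and cleanup_outside_bbox is a made-up name. Note that this simple version clips ways that cross the bbox edge, since their outside nodes are removed.

```python
import sqlite3


def cleanup_outside_bbox(conn, min_lat, min_lon, max_lat, max_lon):
    """Delete nodes outside the bbox, then any way references and ways left orphaned."""
    cur = conn.cursor()
    cur.execute(
        "DELETE FROM nodes WHERE lat < ? OR lat > ? OR lon < ? OR lon > ?",
        (min_lat, max_lat, min_lon, max_lon),
    )
    removed_nodes = cur.rowcount
    # Drop references to nodes that no longer exist.
    cur.execute("DELETE FROM way_nodes WHERE node_id NOT IN (SELECT id FROM nodes)")
    # Drop ways that have no nodes left at all.
    cur.execute("DELETE FROM ways WHERE id NOT IN (SELECT way_id FROM way_nodes)")
    conn.commit()
    return removed_nodes


# Simplified, hypothetical tables and invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
conn.execute("CREATE TABLE ways (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE way_nodes (way_id INTEGER, node_id INTEGER)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [(1, 51.50, -0.12), (2, 51.51, -0.11), (3, 48.85, 2.35), (4, 48.86, 2.36)],
)
conn.executemany("INSERT INTO ways VALUES (?)", [(10,), (11,), (12,)])
conn.executemany(
    "INSERT INTO way_nodes VALUES (?, ?)",
    [(10, 1), (10, 2), (11, 2), (11, 3), (12, 3), (12, 4)],
)
conn.commit()

removed = cleanup_outside_bbox(conn, 51.0, -1.0, 52.0, 0.0)
kept_nodes = [row[0] for row in conn.execute("SELECT id FROM nodes ORDER BY id")]
kept_ways = [row[0] for row in conn.execute("SELECT id FROM ways ORDER BY id")]
print(removed, kept_nodes, kept_ways)
```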

Cheers

Steve

* Currently using Osmdiff to see a change in a small area involves downloading the 2 planet files of interest, likely to be > 90MB compressed, then running osmosis on both for the area in question, then running osmdiff. Maybe I'm being a download-a-phobe as I only have mobile broadband (thanks to the crapness of BT and the inability of Ofcom to have a competitive fixed line market) but that seems like a lot of work to me.
You should never have to download a full planet more than once.  You can use changesets to create a new planet based on an old one.  I haven't used osmdiff though so not sure what the full workflow looks like.

Not sure if I've helped you at all but hopefully some of it is useful.

Brett

PS.  I really should document the 0.6 pgsql schema on the wiki at some point.
