[OSM-dev] Osmosis and Postgresql
Jeffrey Warren
warren at mit.edu
Mon Jun 22 21:46:29 BST 2009
Hi, I'm compiling a very long description of a Postgres/Rails port schema
import, though I'm writing directly to the db. I'd be happy to share/post my
notes on the process as I managed to get around a number of undocumented
problems.
I'm still not done right now (i've been pinging the list here occasionally)
and my latest problem is that I'm getting several errors like the following:
ERROR: insert or update on table "current_way_nodes" violates foreign key
constraint "current_way_nodes_node_id_fkey"
DETAIL: Key (node_id)=(17704640) is not present in table "current_nodes".
********** Error **********
ERROR: insert or update on table "current_way_nodes" violates foreign key
constraint "current_way_nodes_node_id_fkey"
SQL state: 23503
Detail: Key (node_id)=(17704640) is not present in table "current_nodes".
These are current_way_nodes which were not deleted from the current_ table
when their matching current_nodes were deleted. I'm now trying to delete
them manually since this only occurs where nodes are visible=false, but it's
confusing when this causes the foreign key constraint creation to fail:
ALTER TABLE ONLY current_way_nodes
ADD CONSTRAINT current_way_nodes_node_id_fkey FOREIGN KEY (node_id)
REFERENCES current_nodes(id);
Has anyone else encountered this? Is there a better way to resolve it?
I tried following how the rails port actually deletes from current_nodes,
but was unable to find it; is this done outside rails code? Can someone
point to a line number where this is done so I can try to find out why the
dependent current_way_node record wasn't deleted also? Hopefully this will
eliminate this problem for future submitted changesets.
Jeff
======== history of a node with leftover current_way_node ==========
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="OpenStreetMap server">
<node id="17704640" lat="51.5465346" lon="-2.4116768" changeset="127793"
user="southglos" uid="3937" visible="true" timestamp="2006-10-06T11:08:17Z"
version="1">
<tag k="created_by" v="JOSM"/>
</node>
<node id="17704640" lat="51.5465346" lon="-2.4116768" changeset="601639"
user="Strange but untrue" uid="57932" visible="true"
timestamp="2009-04-16T22:28:59Z" version="2"/>
<node id="17704640" lat="51.5465346" lon="-2.4116768" changeset="601639"
user="Strange but untrue" uid="57932" visible="true"
timestamp="2009-04-16T22:31:23Z" version="3"/>
<node id="17704640" lat="51.5465346" lon="-2.4116768" changeset="601639"
user="Strange but untrue" uid="57932" visible="true"
timestamp="2009-04-16T22:32:07Z" version="4"/>
<node id="17704640" lat="51.5465346" lon="-2.4116768" changeset="601639"
user="Strange but untrue" uid="57932" visible="true"
timestamp="2009-04-16T22:32:19Z" version="5"/>
<node id="17704640" lat="51.5465346" lon="-2.4116768" changeset="601639"
user="Strange but untrue" uid="57932" visible="false"
timestamp="2009-04-16T22:32:21Z" version="6"/>
</osm>
On Mon, Jun 22, 2009 at 4:09 PM, 80n <80n80n at gmail.com> wrote:
> On Mon, Jun 22, 2009 at 2:29 AM, Brett Henderson <brett at bretth.com> wrote:
>
>> Graham Jones wrote:
>> > Brett,
>> >
>> > I'll have a look at --write-pgsql-dump and add some words about that
>> > too. Then I'll have a go at MySQL...
>> >
>> > The other thing I would like to add is a write-up on choosing a
>> > database to use, because it is not obvious to me which would be the
>> > best, but I suspect this has been looked at a lot before now. I think
>> > the choices are:
>> I'm not aware of a page summarising this, so it sounds very useful.
>> Lots of people ask about it.
>> >
>> > * PostgreSQL/PostGIS - Can handle big datasets, and has geographic
>> > extensions. What I do not know is whether these extensions are
>> > fast or not, or if for simple things like selecting for a
>> > bounding box it would be quicker to just select on lat/lon
>> > directly. Mapnik uses this database, but the schema is
>> > optimised for rendering.
>> >
>> There's actually 3 PostgreSQL schemas which causes much confusion:
>> 1. The API database. This is the database behind the main OSM
>> ruby-based API. This is supported by the osmosis "apidb" tasks. The
>> only db capable of holding full history.
>> 2. The Mapnik database. Used by the mapnik renderer. Loaded using
>> osm2pgsql. Not supported by osmosis. Uses PostGIS extensions. Holds a
>> filtered subset of data appropriate for rendering.
>> 3. The pgsql "simple" schema. Badly named, but this is a schema I
>> created for holding OSM snapshots (ie. doesn't maintain history). It
>> does support replication for keeping it up to date. Uses PostGIS
>> extensions. Used by ROMA servers.
>>
>> All three schemas above can support minutely updates, 1 and 3 by
>> osmosis, 2 by osm2pgsql.
>> >
>> > *
>> >
>> >
>> > * MySQL - Can handle big datasets, but does not have geographic
>> > estensions.
>> >
>> This is the same schema as the PostgreSQL API schema, but implemented in
>> MySQL. Provides identical features, but no longer the preferred
>> database of choice. It's being maintained in the short term, but I
>> suspect it will get dropped over time if PostgreSQL continues to work
>> well.
>> >
>> > * Sqlite - I have seen a discussion about this and I think there
>> > are doubts over how well it works for big datasets?
>> >
>> No idea on this one, but yes I'd be dubious about its use for large
>> databases. Not supported by osmosis.
>>
>> On this note, I've also messed about with the Berkeley DB Java Edition
>> which didn't work well on large datasets. I've long since deleted the
>> osmosis tasks that used it.
>> >
>> > * GT.M - I know nothing about this, but the xapi servers seem to
>> > use it?
>> >
>> I don't know anything about this either.
>>
>
> GT.M is a high performance schemaless datastore. A bit like CouchDB.
>
> Probably not what you are looking for.
>
> 80n
>
>
>
>> >
>> > *
>> >
>> >
>> > I would also like to include some example SQL queries to deal with
>> > selecting nodes / ways etc. in a given bounding box, or searching for
>> > points of interest. These are not too difficult to do, but I suspect
>> > that some ways are more efficient than others, so it would be nice to
>> > include 'optimum' examples.
>> Sounds reasonable. You may wish to check out the osmosis "pgsql"
>> queries in this space, it uses true geo-spatial queries if the optional
>> way "linestring" and "bbox" columns are available.
>> >
>> > Do you know if this has already been done and written up somewhere?
>> > If not I will set up a template and ask for help filling it in!
>> I haven't seen it before (not that I've looked ...). Sounds great to me!
>>
>> A couple of other db mechanisms are available, details are on the wiki.
>> TRAPI - Stores data using a tile-based mechanism.
>> http://wiki.openstreetmap.org/wiki/TRAPI
>> osmbin - A compact custom storage mechanism. Supported by osmosis via a
>> plugin.
>>
>> http://wiki.openstreetmap.org/wiki/Osmosis/DetailedUsage#--write-osmbin-0.6
>>
>> Brett
>>
>>
>> _______________________________________________
>> dev mailing list
>> dev at openstreetmap.org
>> http://lists.openstreetmap.org/listinfo/dev
>>
>
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/dev/attachments/20090622/f2a37644/attachment.html>
More information about the dev
mailing list