[OSM-dev] Osm2pgsql outputs neg. and duplicate osm_ids (and weird attributes in table rels)

Jon Burgess jburgess777 at gmail.com
Sat Jul 23 21:25:52 BST 2011


On Sat, 2011-07-23 at 21:31 +0200, Stefan Keller wrote:
> Hi,
> 
> I have some questions about the osm2pgsql and it's output to the
> tables <<osm>>_line, <<osm>>_polygon and <<osm>>_rels (hereby called
> tables line, polygon and rels).
> 
> After loading OSM data there are three remarkable things in the
> resulting tables regarding osm_ids:
> 
> 1. In the tables lines and polygons there exist rows with negative osm_ids.
> 2. Tables lines and polygons contain rows with duplicate osm_ids and
> identical attributes except geometry.
> 
> I'd explain the first observation (negative osm_ids) by the following:
> The original (positive) id comes from a relation including its tags.
> So, given e.g. a multi-line or a multi-polygon relation which referrs
> to two ways it's clear that there is one (relation) id split into two
> rows in line/polygon table with the same (now negative) osm_id and
> with different geometries each.

That is an accurate description of how and why osm2pgsql generates these
rows with duplicate and negative IDs.

> Table polygon has duplicate osm_ids like lines table but only with
> negative ones. But lines table also contains positive osm_ids!
> 
> => Why this duplicate positive osm_ids in lines?

If you gave an ID as an example I could be more certain but I think
you're seeing another feature of osm2pgsql: If the line is very long
then it gets split every 100km (or 1 degree for the latlong output).
Search for "split_at" in the osm2pgsql source. This tries to limit the
bounding boxes of very long ways which might otherwise end up being
returned in lots of bbox() based queries. 

> In original OSM database there is no relation and only one way id (see
> e.g. http://www.openstreetmap.org/browse/way/47926394 and compared
> this to "select id, tags, length(way) from osm_line where id=47926394"
> which will return three rows).

Are you sure you are talking about the line table in an osm2pgsql
database? The column names seem strange and I don't see any rows for
this ID in a osm2pgsql planet database:

gis=> select count(*) from planet_osm_line where osm_id=47926394;
 count 
-------
     0
(1 row)

Since the way is untagged I would not expect it to be appearing in the
line table with its own ID although its geometry may appear as part of
one of its parent relations.


> 3. Then in table <<osm>>_rels which I don't understand following attributes:
> * way_off (smallint)
> * rel_off (smallint)
> * pending (boolean) -- exists also in table osm_ways
> * parts (integer[])
> * members (text[])

I don't remember the precise details but I believe that "parts" contains
all the node/way/relation IDs which are referenced by the relation. The
list starts with the node IDs, then at offset way_off in the list the
way IDs begin. Then from rel_off to the end are the relation IDs. 

The members list also has a list of the members with the type (n,w,r)
followed by the ID. The pending flag is used internally by osm2pgsql to
mark objects which have been effected by data changes and need to be
reprocessed in the later phases of the import process.

The internal details of the nodes/ways/rels tables are not really
intended to be used by anything other than osm2pgsql itself.

   Jon






More information about the dev mailing list