[OSM-dev] Questions on PostGIS

Wed Feb 6 12:25:07 GMT 2008

Hi All,

I was hoping to knock up a simple PostgreSQL schema (ie. just planet 
data, no history) for manipulating large data sets, primarily to help 
with extraction of bounding boxes.  I've written my own custom database 
within osmosis but I don't think I'll ever get it to scale as I hoped so 
a real database appears to be necessary.

I'm planning to use the PostGIS support but my knowledge is extremely 
lacking in this area.  I'm prepared to go away and do some more research 
but given that others already have experience with this they might have 
some ideas on how this should be done.  I haven't even looked at the way 
the mapnik database works so point me in that direction if its 
appropriate.  Just keep in mind that I'm trying to keep things fairly 
simple if possible.

So far I've created a node table as follows (tags are in a separate table):
CREATE TABLE node
(
  id bigint NOT NULL,
  user_name text NOT NULL,
  coordinate point NOT NULL,
  tstamp timestamp without time zone NOT NULL,
  CONSTRAINT pk_node PRIMARY KEY (id)
)
WITHOUT OIDS;
ALTER TABLE node OWNER TO postgres;

CREATE INDEX idx_node_location
  ON node
  USING gist
  (circle(coordinate, 0::double precision));

Does this look to be on the right track?  What is the right way of 
indexing point types?  Or should I be using different data types?  I'm a 
bit confused about the difference between geometry types and types such 
as point, circle, etc.
  More importantly, how do I query within a bounding box?

One other thing I noticed is that PostGIS has its own methods for adding 
geometry columns to tables.  Is this something I should be using?  Will 
this adversely affect the size of the on-disk data?

The next question is how do I support ways?  My current schema just has 
a way table and a way_node table which aligns with the way the 
production mysql db works.  Perhaps if I used geo-spatial types instead 
I could do smarter things with bounding box queries (ie. find ways that 
cross a box without any nodes residing within it).  My main concerns are 
that this may 1. drastically increase load times 2. increase data size.

The way tables in question are below (again way tags are elsewhere):

CREATE TABLE way
(
  id bigint NOT NULL,
  user_name text NOT NULL,
  tstamp timestamp without time zone NOT NULL,
  CONSTRAINT pk_way PRIMARY KEY (id)
)
WITHOUT OIDS;
ALTER TABLE way OWNER TO postgres;

CREATE TABLE way_node
(
  way_id bigint NOT NULL,
  node_id bigint NOT NULL,
  sequence_id smallint NOT NULL,
  CONSTRAINT pk_way_node PRIMARY KEY (way_id, sequence_id)
)
WITHOUT OIDS;
ALTER TABLE way_node OWNER TO postgres;

Sorry for the unstructured mess of thoughts above but any input would be 
appreciated,

Cheers,
Brett