[OSM-dev] Questions on PostGIS

Gabriel Ebner ge at gabrielebner.at
Wed Feb 6 14:07:59 GMT 2008

On Wed, Feb 06, 2008 at 11:25:07PM +1100, Brett Henderson wrote:
> I'm planning to use the PostGIS support but my knowledge is extremely 
> lacking in this area.  I'm prepared to go away and do some more research 
> but given that others already have experience with this they might have 
> some ideas on how this should be done.  I haven't even looked at the way 
> the mapnik database works so point me in that direction if its 
> appropriate.

If you mean osm2pgsql, it preprocesses the ways into polylines, so you won't
easily be able to get it out again as separate ways and nodes.

> So far I've created a node table as follows (tags are in a separate table):
> (
>   id bigint NOT NULL,
>   user_name text NOT NULL,
>   coordinate point NOT NULL,

This is postgres' internal point type[1]; if you want to use PostGIS, you'll
need the 'geometry' type.

[1] http://www.postgresql.org/docs/8.3/static/datatype-geometric.html#AEN5480

>   More importantly, how do I query within a bounding box?

select * from nodes where coordinate && '(16,48),(17,49)'::box;

[2] http://postgis.refractions.net/docs/ch04.html#id2699384

> One other thing I noticed is that PostGIS has its own methods for adding 
> geometry columns to tables.  Is this something I should be using?  Will 
> this adversely affect the size of the on-disk data?

This is just there to support the OpenGIS spec.  AddGeometryColumn is nothing
but a fancy wrapper around ALTER TABLE that will also add some checks to the
table and inserts a row in the geometry_columns table.

> The next question is how do I support ways?  My current schema just has 
> a way table and a way_node table which aligns with the way the 
> production mysql db works.  Perhaps if I used geo-spatial types instead 

I'd rather not use them (except for indexing) as you'll lose track of what
nodes you were referring to; which means you won't get the same data out that
you put in.

> I could do smarter things with bounding box queries (ie. find ways that 
> cross a box without any nodes residing within it).  My main concerns are 
> that this may 1. drastically increase load times 2. increase data size.

You'll probably want to store at least the bounding boxes of the ways.  (That
way you'll get O(log(n)) bbox queries)  As to 1) it will probably only pay off
if you have more than one query (same goes for the index on the nodes), and 2)
a box takes 4 doubles, i.e. 24 byte, which is less than the rest of a way
row taken together.

alter table way add bbox box;
update way set bbox = (select collect(coordinate)::box
  from node join way_node on way_node.node_id = node.id
  where way_node.way_id = way.id);



More information about the dev mailing list