[OSM-dev] [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

Fri Feb 17 08:36:25 GMT 2012

Jochen Topf wrote:
> On Fri, Feb 17, 2012 at 07:44:41AM +0000, Jukka Rahkonen wrote:
>> Frederik Ramm <frederik <at> remote.org> writes:
>>
>> >
>> > Hi,
>> >
>> > On 02/16/2012 07:25 PM, Graham Jones wrote:
>> > > This reminded me of a question I have been meaning to ask for quite
>> a
>> > > while - is a database generated by osm2pgsql with an hstore expected
>> to
>> > > perform similarly to one without?
>> >
>> > I never ran one with hstore when I think of what this must mean for
>> the
>> > database engine, and storage space, then I shudder and would not be
>> > surprised by the factor 5 you mentioned.
>>
>> How about doing it with relations? Let the importer program create four
>> tables
>> osm_point
>> osm_line
>> osm_polygon
>> osm_relation
>>
>> These tables would each have two attributes: Geometry and osm_is.
>> Geometries in
>> the osm_relation would be of type "geometry collection", that is, a
>> collection
>> of whatever, and it could hold for example the tranport route relation
>> with the
>> route and bus stops and everything in one PostGIS geometry object.
>>
>> Then there should be one or four tables for tags (everything in one
>> table or
>> split to suit the geometry tables). The three attributes would be
>> osm_id, key,
>> and value. Osm_id would be used as a foreing key for joining geometries
>> and
>> attributes. If osm_id can not be guaranteed to be unique then there
>> should be
>> point_id, line_id, polygon_id and relation_id added to suitable places.
>>
>> Every attribute would be indexed. For Mapnik use where all that is
>> needed is to
>> do simple SQL selects.  I guess that this database would be faster to
>> query than
>> hstore even it also contains all the tags. It might be faster for Mapnik
>> than
>> the current tables with extra wide attribute schema because now
>> attributes are
>> not indexed at all (or are they?). SQL queries according to tags and
>> values or
>> even with a part of the tag and value strings would be easier to do than
>> from
>> hstore. People could enhance attributes for some special uses by
>> converting
>> max_speeds and other values which are actually measures from strings to
>> integers
>> or doubles etc.
>>
>> This option is so obvious that I believe that someone must have tried it
>> already. It would be nice to hear about experiences.
>
> Much too slow. Requiring joins on every query is not a good idea. And
> having
> the right indexes is important. You can't just index everything and hope
> it
> would do the right thing. (The most important index btw is the geometry
> index
> not the attribute indexes, although having the right attribute indexes
> here
> or there can help.)

Sorry, havin spatial index is so obvious that it did not even comen into
my mind to mention it. Do you have cold numbers about speed differences or
suggestions for good SQL queries to test with if I will get interested
enough to do a test some day? With this database structure I do not
foresee big difficulties with indexing, and joins with foreing keys is
vastly used in relational databases so perhaps it is not necessarily so
slow.

-Jukka Rahkonen-

> Jochen
> --
> Jochen Topf  jochen at remote.org  http://www.remote.org/jochen/
> +49-721-388298
>
>