[OSM-dev] Is there a way to use simple schema without hstore

Brett Henderson brett at bretth.com
Thu Nov 18 22:50:38 GMT 2010


Hi Andreas,

The change was made mostly for performance reasons.  With a full planet
imported into the database, bounding box style queries are now approximately
10 times faster.  This is due to a couple of reasons:

   - All data (with the exception of relations) is now clustered by
   geographical location.  This drastically improves performance where data is
   being processed for a limited area.
   - The nodes and ways tables are the only tables that have a geometry
   column, thus other data must be embedded in those tables in order to make
   use of clustering.
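
To give a rough idea of the kind of bounding box query that benefits from the clustering, here is a minimal Python sketch that only builds the SQL text and its parameters. It assumes the nodes table has id, tags and geom columns, that geometries use SRID 4326, and that the database driver accepts %s placeholders (psycopg2 does). The function name and the coordinates are made up for illustration.

```python
# Sketch only: builds a parameterised bounding box query for the nodes table.
# Assumptions (not stated in this message): columns id, tags (hstore) and
# geom (PostGIS point), SRID 4326, and a driver using %s placeholders.

def bbox_nodes_query(min_lon, min_lat, max_lon, max_lat):
    """Return (sql, params) selecting nodes whose geometry lies in the box."""
    sql = (
        "SELECT id, tags FROM nodes "
        "WHERE geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326)"
    )
    return sql, (min_lon, min_lat, max_lon, max_lat)


if __name__ == "__main__":
    # Hypothetical 1x1 degree area.
    sql, params = bbox_nodes_query(13.0, 52.0, 14.0, 53.0)
    print(sql)
    print(params)
```

Because the tags travel with each node row, a query like this gets ids and tags from the same clustered table without a join to a separate tags table.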

I don't understand your comment regarding NoSQL.  The main change is that
now you will have to deal with a more complex hstore column type on the
nodes/ways tables, but otherwise the same data still exists and can still be
manipulated with SQL statements.  The data is less relational than it was
previously, but tag data is not terribly useful without access to parent
entities so grouping them together shouldn't result in loss of
functionality.

You can still populate separate tags tables if you wish by running your own
separate query to pull the hstore column apart.
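
A minimal sketch of such a query, wrapped in Python so it can be generated for both nodes and ways. It relies on the each() function from the hstore module, which returns one key/value row per tag. The target tables node_tags/way_tags with columns (node_id or way_id, k, v), the id and tags column names, and the helper name are assumptions, so adjust them to whatever you have installed.

```python
# Sketch only: SQL that copies hstore tags into a separate key/value table.
# Assumptions: source tables nodes/ways with columns id and tags (hstore);
# target tables node_tags/way_tags with columns (<entity>_id, k, v).

def split_tags_sql(entity):
    """Return an INSERT ... SELECT writing one row per tag for 'node' or 'way'."""
    if entity not in ("node", "way"):
        raise ValueError("entity must be 'node' or 'way'")
    return (
        f"INSERT INTO {entity}_tags ({entity}_id, k, v) "
        f"SELECT id, (each(tags)).key, (each(tags)).value FROM {entity}s"
    )


if __name__ == "__main__":
    print(split_tags_sql("node"))
    print(split_tags_sql("way"))
```

You would run each generated statement once after the initial import, into empty tags tables.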

If you're applying diffs to the database you can enhance the osmosisUpdate()
function (initially empty, but can be customised) to keep your separate tags
tables up to date during each diff application.  You will need to run the
"pgsql_simple_schema_0.6_action.sql" script against the database so that all
actions during a diff are logged and can be used by your osmosisUpdate()
function to work out which records need to be re-processed.

The older Osmosis 0.36 is still available so you don't have to upgrade.  It
remains compatible with 0.6 XML files.  If there is enough demand
for the older schema style the old tasks can be pulled back out of SVN and
run alongside the new ones, but I'm not keen to do that without good
reason.  I did consider trying to support both styles of table in the same
tasks by dynamically detecting what tables are installed, but it increases
the code complexity considerably and I didn't think the effort was
worthwhile.

Finally, I didn't make the change without careful consideration.  I do try
to keep schemas stable, and when they do change I provide an upgrade script
to allow migration between them.  But the performance gains achieved through
use of hstore were too great to ignore.  Retrieving heavily populated 1x1
degree areas from a database containing a full planet used to take
approximately 1 hour, but this is now down to well under 10 minutes.

Hope that helps,
Brett

On Thu, Nov 18, 2010 at 8:18 PM, Andreas Kalsch <andreaskalsch at gmx.de> wrote:

> Is there a way to use simple schema in Osmosis without hstore? And why was
> this changed? A separate table for tags can more easily be indexed. I think
> it is not a good idea to use hstore because then we can drop SQL, use NoSQL
> for storing data and use PostGIS/Postgres for Geometry only.
>
> What do you think?
> Best,
>
> Andi