[OSM-dev] Is there a way to use simple schema without hstore

Andreas Kalsch andreaskalsch at gmx.de
Fri Nov 19 08:37:33 GMT 2010


Hi Brett,

thanks for your elaborate answer! Now I am up to date. Some ideas regarding my use case ...

Am 18.11.10 23:50, schrieb Brett Henderson:
> Hi Andreas,
>
> The change was made mostly for performance reasons.  With a full planet imported into the 
> database, bounding box style queries are now approximately 10 times faster.  This is due to a 
> couple of reasons:
>
>     * All data (with the exception of relations) is now clustered by geographical location.  This
>       drastically improves performance where data is being processed for a limited area.
>     * The nodes and ways tables are the only tables that have a geometry column, thus other data
>       must be embedded in those tables in order to make use of clustering.
>
My concept is always to use _1_ table for all geometries and to create extractions when I need them. 
Because a geom column can store any type, so it is a more unifying concept.
> I don't understand your comment regarding NoSQL.  The main change is that now you will have to 
> deal with a more complex hstore column type on the nodes/ways tables, but otherwise the same data 
> still exists and can still be manipulated with SQL statements.  The data is less relational that 
> it was previously, but tag data is not terribly useful without access to parent entities so 
> grouping them together shouldn't result in loss of functionality.
>
> You can still populate separate tags tables if you wish by running your own separate query to pull 
> the hstore column apart.
This is what I need to do sooner or later, when I will update. It's important for me to use a 
separate table for tags, because I run a script that will correct the tags of relations (from outer 
ways to relations), and I don't want to rewrite this and other scripts that depend on this schema 
every time the version changes.
Running an extra script that fetches the hstore tags and puts them into a separate table will add 
the time that PBF gave me ;(
My main concern is that with the next big schema update I _have_ to patch the schema. On the long 
run it is great to be conservative about such changes or *)
>
> If you're applying diffs to the database you can enhance the osmosisUpdate() function (initially 
> empty, but can be customised) to keep your separate tags tables up to date during each diff 
> application.  You will need to run the "pgsql_simple_schema_0.6_action.sql" script against the 
> database so that all actions during a diff are logged and can be used by your osmosisUpdate 
> function to know which records need to be re-processed.
Is it possible to truncate the actions table for myself so that a separate script can access the 
changes?

This is another important point. In the moment, I manually populate my own "current_features" tables 
after an update that are populated with all features, whose tstamp is >= the time of the last 
update. A little overhead ... I see that this table exists in 0.36 as well, so I could use it, if I 
can truncate it manually?
> The older Osmosis 0.36 is still available so you don't have to upgrade.  It remains compatible 
> with 0.6 XML files.  Finally, if there is enough demand for the older schema style the old tasks 
> can be pulled back out of SVN and run alongside the new ones, but I'm not keen to do that without 
> good reason.  I did consider trying to support both styles of table in the same tasks by 
> dynamically detecting what tables are installed, but it increases the code complexity considerably 
> and I didn't think the effort was worthwhile.
*) With that, you would provide a downward compatible solution that I would appreciate a lot!

Is it necessary that Osmosis makes the schema checks? What about giving each schema a unique ID and 
then let the user point Osmosis to this ID and let it fail, if the user has installed the wrong schema?
>
> Finally, I didn't make the change without careful consideration.  I do try to keep schemas stable, 
> and when they do change I provide an upgrade script to allow migration between them.  But the 
> performance gains achieved through use of hstore were too great to ignore.  Retrieving heavily 
> populated 1x1 degree areas from a database containing a full planet used to take approximately 1 
> hour, but this is now down to well under 10 minutes.
On the long run, this is an argument ;) I am critical, because I still haven't thought through all 
dependant scripts that do something with tags. But there are many ...
>
> Hope that helps,
> Brett
>
> On Thu, Nov 18, 2010 at 8:18 PM, Andreas Kalsch <andreaskalsch at gmx.de 
> <mailto:andreaskalsch at gmx.de>> wrote:
>
>     Is there a way to use simple schema in Osmosis without hstore? And why was this changed? A
>     separate table for tags can more easily be indexed. I think it is not a good idea to use
>     hstore because then we can drop SQL, use NoSQL for storing data and use PostGIS/Postgres for
>     Geometry only.
>
>     What do you think?
>     Best,
>
>     Andi
>
>     _______________________________________________
>     dev mailing list
>     dev at openstreetmap.org <mailto:dev at openstreetmap.org>
>     http://lists.openstreetmap.org/listinfo/dev
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/dev/attachments/20101119/1cc73e8c/attachment-0001.html>


More information about the dev mailing list