[OSM-dev] Is there a way to use simple schema without hstore
andreaskalsch at gmx.de
Fri Nov 19 08:37:33 GMT 2010
thanks for your elaborate answer! Now I am up to date. Some ideas regarding my use case ...
Am 18.11.10 23:50, schrieb Brett Henderson:
> Hi Andreas,
> The change was made mostly for performance reasons. With a full planet imported into the
> database, bounding box style queries are now approximately 10 times faster. This is due to a
> couple of reasons:
> * All data (with the exception of relations) is now clustered by geographical location. This
> drastically improves performance where data is being processed for a limited area.
> * The nodes and ways tables are the only tables that have a geometry column, thus other data
> must be embedded in those tables in order to make use of clustering.
My concept is always to use _1_ table for all geometries and to create extractions when I need them.
Because a geom column can store any type, so it is a more unifying concept.
> I don't understand your comment regarding NoSQL. The main change is that now you will have to
> deal with a more complex hstore column type on the nodes/ways tables, but otherwise the same data
> still exists and can still be manipulated with SQL statements. The data is less relational that
> it was previously, but tag data is not terribly useful without access to parent entities so
> grouping them together shouldn't result in loss of functionality.
> You can still populate separate tags tables if you wish by running your own separate query to pull
> the hstore column apart.
This is what I need to do sooner or later, when I will update. It's important for me to use a
separate table for tags, because I run a script that will correct the tags of relations (from outer
ways to relations), and I don't want to rewrite this and other scripts that depend on this schema
every time the version changes.
Running an extra script that fetches the hstore tags and puts them into a separate table will add
the time that PBF gave me ;(
My main concern is that with the next big schema update I _have_ to patch the schema. On the long
run it is great to be conservative about such changes or *)
> If you're applying diffs to the database you can enhance the osmosisUpdate() function (initially
> empty, but can be customised) to keep your separate tags tables up to date during each diff
> application. You will need to run the "pgsql_simple_schema_0.6_action.sql" script against the
> database so that all actions during a diff are logged and can be used by your osmosisUpdate
> function to know which records need to be re-processed.
Is it possible to truncate the actions table for myself so that a separate script can access the
This is another important point. In the moment, I manually populate my own "current_features" tables
after an update that are populated with all features, whose tstamp is >= the time of the last
update. A little overhead ... I see that this table exists in 0.36 as well, so I could use it, if I
can truncate it manually?
> The older Osmosis 0.36 is still available so you don't have to upgrade. It remains compatible
> with 0.6 XML files. Finally, if there is enough demand for the older schema style the old tasks
> can be pulled back out of SVN and run alongside the new ones, but I'm not keen to do that without
> good reason. I did consider trying to support both styles of table in the same tasks by
> dynamically detecting what tables are installed, but it increases the code complexity considerably
> and I didn't think the effort was worthwhile.
*) With that, you would provide a downward compatible solution that I would appreciate a lot!
Is it necessary that Osmosis makes the schema checks? What about giving each schema a unique ID and
then let the user point Osmosis to this ID and let it fail, if the user has installed the wrong schema?
> Finally, I didn't make the change without careful consideration. I do try to keep schemas stable,
> and when they do change I provide an upgrade script to allow migration between them. But the
> performance gains achieved through use of hstore were too great to ignore. Retrieving heavily
> populated 1x1 degree areas from a database containing a full planet used to take approximately 1
> hour, but this is now down to well under 10 minutes.
On the long run, this is an argument ;) I am critical, because I still haven't thought through all
dependant scripts that do something with tags. But there are many ...
> Hope that helps,
> On Thu, Nov 18, 2010 at 8:18 PM, Andreas Kalsch <andreaskalsch at gmx.de
> <mailto:andreaskalsch at gmx.de>> wrote:
> Is there a way to use simple schema in Osmosis without hstore? And why was this changed? A
> separate table for tags can more easily be indexed. I think it is not a good idea to use
> hstore because then we can drop SQL, use NoSQL for storing data and use PostGIS/Postgres for
> Geometry only.
> What do you think?
> dev mailing list
> dev at openstreetmap.org <mailto:dev at openstreetmap.org>
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the dev