[OSM-dev] Area support
brett at bretth.com
Thu Jul 10 04:09:44 BST 2008
Stefan Keller wrote:
> Agreed. An area (or polygon) simply is a basic geometry type; it's a
> fact in practice and computational geometry.
> The actual XML encoding of ways never really was according to best
> practices and will not scale. This is because even for ways we have to
> read in all nodes until the end of the stream/file before any way can
> be created. Now, when we misuse (application oriented) relations to
> encode areas, processors are again forced to read in *all* relation
> instances before any area can be instantiated... Can't we do that
> better? (the craziness would be complete when one would 'consequently'
> encode areas *and* ways as partially ordered relation instances which
> would point to nodes).
> I would take this discussion as a chance to enhance the OSM/XML
> encoding: Ways should be encoded with nodes (coordinates) embedded,
> and areas (polygons) ought to be encoded with one way as outer
> boundary and zero or more inner ways (boundaries) - embedded. I would
> even differentiate areas which overlap and areas which don't (but this
> is more on the conceptual and application modelling level and makes
> no difference in the encoding). Look at the simplicity and usability
> of such an XML encoding below...
A few comments:
Are you suggesting that we should represent areas as a set of ways?
Would this mean you should modify the xml structure below to include the
way id on each way element?
This wouldn't be advisable. It would require each set of ways to be
closed, which would be hard to enforce because you'd end up having to
check if they're a member of an area. Perhaps you didn't mean this
though and way was just a convenient xml element for grouping nodes
within the xml.
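
To make the enforcement problem concrete, here is a minimal sketch of the
check every member way would need, assuming a way is an ordered list of node
ids and "closed" means the first and last ids match. The function names and
the area_ways mapping are hypothetical, not part of any existing tool.

```python
def is_closed(node_refs):
    """Return True if a way's node id sequence forms a closed ring."""
    # A ring needs at least three distinct points plus the repeated
    # first node at the end, so four refs is the minimum.
    return len(node_refs) >= 4 and node_refs[0] == node_refs[-1]


def unclosed_members(area_ways):
    """Return the ids of member ways that are not closed.

    area_ways maps a way id to its ordered list of node ids
    (a hypothetical in-memory shape chosen for this sketch).
    """
    return [way_id for way_id, refs in area_ways.items() if not is_closed(refs)]
```

The check itself is cheap; the awkward part is that it has to run on any
edit to a way that happens to be a member of an area, which means looking up
area membership first.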
I'm curious what value the geometry xml element adds but this is just
semantics so I'm not too fussed.
I've been thinking about the inclusion of node lat/lon information in
the file. Initially I thought it was a good idea but on further thought
I'm becoming convinced that it's not the way to go.
Pros:
- Greatly simplifies stream processing of large osm files, avoiding
  temporary (memory or disk-based) storage.
Cons:
- It adds redundancy to the file.
- It increases the size of planet dumps.
- Post-processing can achieve the same result and can be done efficiently
  with an intermediate database using changesets.
- It will add significant overhead to the existing planetdump and osmosis
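
For reference, the temporary storage in question looks roughly like the
sketch below: with the current encoding a consumer has to keep every node's
coordinates around until the ways arrive. This is only an illustration,
assuming nodes precede ways in the stream and that ways reference nodes
through nd elements with a ref attribute. The sample data reuses coordinates
from Stefan's example with made-up ids, and way_geometries is a hypothetical
name.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample in the current node/way encoding; ids are invented.
SAMPLE = """<osm>
  <node id="1" lat="47.23439" lon="8.82187"/>
  <node id="2" lat="47.23411" lon="8.82362"/>
  <node id="3" lat="47.23411" lon="8.82111"/>
  <way id="100">
    <nd ref="1"/>
    <nd ref="2"/>
    <nd ref="3"/>
    <nd ref="1"/>
  </way>
</osm>"""


def way_geometries(source):
    """Yield (way_id, [(lat, lon), ...]) from an OSM XML stream.

    Assumes nodes appear before the ways that reference them.
    """
    coords = {}  # the temporary storage: grows with every node read
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            coords[elem.get("id")] = (float(elem.get("lat")), float(elem.get("lon")))
            elem.clear()  # release the parsed element, keep only the coordinates
        elif elem.tag == "way":
            refs = [nd.get("ref") for nd in elem.findall("nd")]
            yield elem.get("id"), [coords[r] for r in refs]
            elem.clear()


if __name__ == "__main__":
    for way_id, points in way_geometries(io.BytesIO(SAMPLE.encode("utf-8"))):
        print(way_id, points)
```

The coords dictionary is what stops fitting in memory on a full planet file;
embedding lat/lon in each way would remove it at the cost of repeating every
shared node.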
It does have some immediate advantages but at the end of the day it is
an optimisation and I believe there are other perhaps more effective
ways of achieving the same result. If we can move towards the use of
databases and changesets instead of complete dump files we will scale
more effectively. In other words, the reason we have this issue is
because the data is growing too large to hold node lat/lon information
in memory and we want the main database to do the area-way-node
correlation for us. I'm suggesting that if we use a database locally
this problem goes away and we gain the added advantage of being able to
work with changesets instead of downloading complete snapshots every time
we need an update. We should be minimising the load on the primary
database and moving non-essential (ie. non-editing) work offline.
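
As a rough illustration of the local-database idea, the sketch below keeps
nodes and way membership in SQLite, applies individual change records, and
lets the database do the way-node correlation with a join. The schema, the
function names and the shape of a change record are all assumptions made up
for this example; it is not the osmosis schema or its change file format.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small local store for nodes and way membership."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS node (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
    db.execute(
        "CREATE TABLE IF NOT EXISTS way_node ("
        "way_id INTEGER, seq INTEGER, node_id INTEGER, PRIMARY KEY (way_id, seq))"
    )
    return db


def apply_change(db, action, kind, obj):
    """Apply one change record; action is 'create', 'modify' or 'delete'."""
    if kind == "node":
        if action == "delete":
            db.execute("DELETE FROM node WHERE id = ?", (obj["id"],))
        else:
            db.execute("INSERT OR REPLACE INTO node VALUES (?, ?, ?)",
                       (obj["id"], obj["lat"], obj["lon"]))
    elif kind == "way":
        # Replace the whole membership list; simplest correct behaviour.
        db.execute("DELETE FROM way_node WHERE way_id = ?", (obj["id"],))
        if action != "delete":
            db.executemany("INSERT INTO way_node VALUES (?, ?, ?)",
                           [(obj["id"], i, ref) for i, ref in enumerate(obj["refs"])])


def way_geometry(db, way_id):
    """Return the ordered (lat, lon) list for a way via a join."""
    rows = db.execute(
        "SELECT n.lat, n.lon FROM way_node w JOIN node n ON n.id = w.node_id "
        "WHERE w.way_id = ? ORDER BY w.seq", (way_id,))
    return rows.fetchall()
```

With something along these lines, an update only touches the rows named in
the change records, and the coordinates for any way can be pulled out on
demand without rereading a full snapshot.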
Hope that doesn't come across as too negative. It's definitely worth
having this discussion, and there may be things I haven't considered.
> Example of an enhanced OSM data encoding of an area and its boundaries
> (ways) as proper property types:
> <area id="4304746" timestamp="2008-03-25T21:31:01+00:00"
> <tag k="landuse" v="water"/>
> <tag k="created_by" v="JOSM"/>
> <tag k="natural" v="water"/>
> <tag k="name" v="Lake of Zuerich"/>
> <node lat="47.23439" lon="8.82187"></node>
> <node lat="47.23411" lon="8.82362"></node>
> <node lat="47.23411" lon="8.82111"></node>
> <node lat="47.23499" lon="8.82199"></node>