[OSM-dev] A proposal for linking multiple ways into a virtual polygon

Jon Burgess jburgess777 at googlemail.com
Thu May 17 00:11:48 BST 2007

On Wed, 2007-05-16 at 23:26 +0100, 80n wrote:
> On 5/16/07, Jon Burgess <jburgess777 at googlemail.com> wrote:
>         On Wed, 2007-05-16 at 16:35 +0100, 80n wrote:
>         > Jon
>         >
>         > On 5/16/07, Jon Burgess <jburgess777 at googlemail.com> wrote:
>         >         On 16/05/07, 80n < 80n80n at gmail.com> wrote:
>         >         > The node Id of the last node in the way will
>         always be the
>         >         same as the node
>         >         > Id for the first node in the next way.  There is
>         not any 
>         >         real need to add
>         >         > linkage tags, the information is already present.
>         >         >
>         >         > The only exception would be if the area was
>         connected to
>         >         some other way as 
>         >         > well.  This would be fairly rare I think.
>         >         >
>         >
>         >         As I see it, providing redundant data is great for
>         error
>         >         detection and
>         >         recovery. See my earlier comments about being able
>         to report 
>         >         exactly
>         >         which way ID seems to have been deleted.
>         >
>         >         It also allows any tool to iterate a coastline or
>         other
>         >         polygon in
>         >         either direction. A single linked list would require
>         much more 
>         >         effort
>         >         in this case.
>         >
>         >         Not providing any information about the links
>         requires any
>         >         tool which
>         >         wants to re-join ways to do a non-trivial amount of
>         work to 
>         >         infer the
>         >         adjoining ways. See my earlier arguments.
>         >
>         > Surely, finding the next way that is attached to the end
>         node of the
>         > current way is trivial?  What difficulty are you
>         contemplating? 
>         >
>         The amount of CPU time needed to process a planet.osm file in
>         one go.
>         > Select the to node of the last segment in the current way.
>         - One node selected. Easy and quick.
>         > Select all segments that have this node as its from node. 
>         - Iterate 10M segments. Easy, but not quick.
>         > Select all ways that use this segment.
>         - Iterate 600k ways. Easy, but not quick.
>         > If there is more than one way then error, otherwise you have
>         your next 
>         > way.
>         >
>         - We'd expect exactly 2 ways if we include the one we are
>         trying to
>         join.
> Not if you select only segments that have the nodeID as the "From
> Node" - then you will only get one segment. 
>         - Repeat all the above steps 600k times.
>         > Is this hard to do with SQL or something? 
>         No. It is tricky to do in a CPU efficient way using the data
>         structures
>         which osm2pgsql currently uses to hold the planet.osm file
>         while
>         determining the geometries that it needs to write to the DB.
> Yeah, you'd need to have appropriate indexes - didn't realize that you
> don't have them at this point in the process.
> Can you pre-process to build a lookup table?  Alternatively, could you
> defer the way matching to later on after the db has been created (with
> the required indexes)? 
Yes. Sorry perhaps you misunderstood my aim for adding these pointers.

I was not trying to argue that it was impossible to do this processing.
It obvious is possible to work this stuff out.

My point was that it would surely be better if the relationships
describing that the a given set of ways formed a polygon was stored
inside the data instead of having to do any post-processing of the data
to make a best guess of the relationships.

Bitter experience has shown that if 2 programs need to interpret
incomplete requirements then they will develop different approaches. For
the majority of cases they will probably produce the equivalent results
(they are trying to fulfil the same broad aims) but for many edge cases
the results will differ significantly.

Just look at how the rendering of different things differs between
osmarender and mapnik. They are both notionally trying to render the
same data to produce a similar result. Key features like motorways and
primary roads exist in both, but both tools have there own different
solution to the rendering of the sea/coastline.

If we define that these tags are used to indicate that ways form a
polygon then we encourage convergence between the tools and simplify the
combined effort.

I'm sure the coastline rendering would be more consistent if we
developed something which was used by both mapnik and osmarender.

>         The osm2pgsql code does not perform any queries on the DB
>         while it is
>         generating the data.
>         Yes these things can be solved. The CPU effort to do so would
>         be
>         significantly reduce if the adjoing way IDs were stored in the
>         way key
>         values.
>         >
>         osm2pgsql does have the benefit of being able to rely on the
>         PostgreSQL 
>         geometrey queries which almost certainly can be used to do
>         this all
>         quite quickly, but my concern was that other tools may not
>         find it so
>         easy.
>         I don't want to put too much effort into an optimised solution
>         if it is 
>         quickly realised that the links should be added to the DB and
>         my
>         solution is a waste of effort.
>                 Jon
>         >

More information about the dev mailing list