[OSM-dev] Very long ways have been split (was: Status of Database Server after 0.4 Upgrade: Fragile)

Jon Burgess jburgess777 at googlemail.com
Sun May 13 23:19:14 BST 2007

On Sun, 2007-05-13 at 23:50 +0200, Frederik Ramm wrote:
> Hi,
> >> Are they going to render correctly if you've split the ways, so they
> >> don't form closed ways anymore?
> > 
> > More importantly (to me), why are we systematically destroying data to
> > fit into some arbitrary tool? These lakes are now no more recognizable
> > as polygon areas, but instead two linestrings. Great, now we can't pick
> > out polygons anymore automatically, instead we have to manually
> > postprocess the data.
> The API is not "some arbitrary tool". Even if it *could* handle ways of 
> any length (which it cannot) there would have to be some sort of cutoff 
> point; the whole Eurasian coastline may be a polygon, and it may be 
> "destroying" data to split it into several parts, but it is simply not 
> practical to have it as one way. (Whenever a download bounding box 
> contained a bit of coastline, you'd have to download hundreds of 
> megabytes of coastline!)

How about

Plan B:
- If a way is too large to return, don't return it!
- Instead perhaps we can add a new tag which says something like:

<way id="12345" segs_excluded="true" seg_count="5000">
 <tag k=... > 
 [ more tags, but no segments ]

That way renderers or editors can be informed that a large way exists
even though the segments and nodes have not been downloaded.

When faced with this situation, maybe a tool could fall back to using a
downloaded planet.osm dump to get information about the way?
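
A minimal sketch of how Plan B could look on the serializing side. The
attribute names segs_excluded and seg_count are the ones proposed above;
the function names and the MAX_SEGS cutoff are hypothetical, and none of
this is existing API code.

```python
import xml.etree.ElementTree as ET

MAX_SEGS = 1000  # hypothetical cutoff; no number is proposed in this thread


def way_to_xml(way_id, tags, seg_ids, max_segs=MAX_SEGS):
    """Serialize a way, leaving out its segments if there are too many."""
    way = ET.Element("way", id=str(way_id))
    if len(seg_ids) > max_segs:
        # Too large to return: announce the way, but send no segments.
        way.set("segs_excluded", "true")
        way.set("seg_count", str(len(seg_ids)))
    else:
        for seg_id in seg_ids:
            ET.SubElement(way, "seg", id=str(seg_id))
    # Tags are always included, so clients still learn what the way is.
    for key, value in tags.items():
        ET.SubElement(way, "tag", k=key, v=value)
    return way


def needs_planet_lookup(way):
    """True if a client must get the geometry elsewhere, e.g. planet.osm."""
    return way.get("segs_excluded") == "true"
```

A renderer or editor would check needs_planet_lookup() on each way it
receives and decide whether to skip the way or look it up in a local dump.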

Plan C:
Enhance the API to be able to upload/download a subset of the way segments,
e.g. something like...

<way id="12345" seg_index_start="1000" seg_index_end="1500">
 <seg id=...>
 [ Only the 500 segs with index 1000 - 1500 of the complete way included
 in download ]

The API can determine how many segments it wants to return (e.g. maybe
those within 2 x BBox request).

The editor could freely edit, add and remove segments within this list
provided it kept the start/end indexes constant (allowing the API to
determine where to insert the uploaded data in the larger way).
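
A rough sketch of the server-side splice Plan C implies. It assumes the
indexes are 0-based and that seg_index_end is exclusive, which matches
"500 segs with index 1000 - 1500"; the function name is hypothetical.

```python
def splice_segments(full_way, start, end, edited):
    """Replace full_way[start:end] with the segments the editor uploaded.

    full_way -- complete ordered list of segment ids stored for the way
    start    -- seg_index_start as sent with the download
    end      -- seg_index_end as sent with the download (exclusive)
    edited   -- the editor's version of that window; may be any length
    """
    if not 0 <= start <= end <= len(full_way):
        raise ValueError("window does not fit the stored way")
    # Everything outside the window is untouched; the window is swapped out.
    return full_way[:start] + list(edited) + full_way[end:]
```

Because the editor returns the original indexes unchanged, the window it
uploads can grow or shrink freely. The sketch does not handle the case
where the stored way changed between download and upload, which would
make the indexes stale.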

> So it is obvious that we need a mechanism to deal with large areas 
> *without* having them in the data base as one single polygon. This is 
> not something under discussion, it is a fact.

Not true, they have existed in the DB for a while. Yes, they have caused
issues, but removing them is not necessarily the only or best answer.

> The coastline display in tiles@home is solved; I don't know how Mapnik 
> deals with it but Mapnik cannot reasonably expect us to have one way for 
> the whole coastline of a continent. 

Mapnik does not expect t@h to do anything. It only worries about how it
renders data. On the contrary, it seems you've done this without worrying
about how Mapnik will render the data.

I think you have just broken the rendering of all these ways in Mapnik.
Previously it had no issues with either ways of an arbitrary length or
bounding box (since it uses the planet.osm dump).

> Other areas do not yet render 
> properly in tiles@home if they're not a closed way, but a solution to 
> that is around the corner (with a modified close-areas.pl); I cannot 
> speak for Mapnik but I don't see a big computational problem in 
> reconstructing an area from joined ways if you need it.

Feel free to submit a patch for osm2pgsql.c as soon as you have it
implemented :-)

While it seems simple in theory, many things become non-trivial once you
are processing an entire planet.osm file.
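
To make the "simple in theory" part concrete, here is a naive sketch of
joining split ways back into closed rings. It is not osm2pgsql code, and
it assumes each way has already been reduced to an ordered list of node
ids; the function name is hypothetical.

```python
def join_ways(ways):
    """Join ways end to end; return (closed_rings, open_chains).

    Each way is a list of node ids. Ways are reversed where needed so
    that shared end nodes line up.
    """
    remaining = [list(w) for w in ways if len(w) >= 2]
    rings, chains = [], []
    while remaining:
        chain = remaining.pop()
        extended = True
        while chain[0] != chain[-1] and extended:
            extended = False
            for i, other in enumerate(remaining):
                if other[0] == chain[-1]:
                    chain = chain + other[1:]        # append as is
                elif other[-1] == chain[-1]:
                    chain = chain + other[-2::-1]    # append reversed
                elif other[-1] == chain[0]:
                    chain = other[:-1] + chain       # prepend as is
                elif other[0] == chain[0]:
                    chain = other[:0:-1] + chain     # prepend reversed
                else:
                    continue
                del remaining[i]
                extended = True
                break
        (rings if chain[0] == chain[-1] else chains).append(chain)
    return rings, chains
```

The linear scan for a matching end node makes this quadratic in the
number of ways. That is fine for a handful of lakes, but on a full
planet.osm it would need at least an index keyed on end nodes, plus a
rule for which of the joined ways supplies the tags.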

> I could have split only things larger than 1000 segments or larger than 
> 1500 segments or whatever, but given we need to find a proper solution 
> for the problem anyway, why should we stretch the limit?
> I am able to undo the splitting of all or of selected items on the list 
> and restore areas to being closed ways, but I really don't see the point 
> - we cannot have closed areas for everything that logically is one 
> entity, so why fight over individual cases?

The concept of a closed way representing an area is defined as part of
the current node/segment/way model.

