[OSM-dev] segment discussion - fork of mysql partitioning

Thu Aug 31 13:04:41 BST 2006

I would love to see a strict separation between physical and logical
properties, segments representing the former and ways the latter.

I believe that every segment should contain information about the
underlying physical structure: just about everything that's visible to
the naked eye when visiting that place. Things like:

- road width (or physical classification)
- lanes
- oneway
- tunnel, bridge, embankment
- elevation (for map rendering, "layers")
- bicycle lane
- bus, taxi lane
- paving (tarmac, concrete, gravel)
- street lighting
- ...

It's still possible to reduce that data overhead by grouping segments with
the same properties server-side, assigning an id and storing that
information only once. The osm xml output would then contain a tag like

<properties id="12345">
<tag k="blah" v="a"/> ...
<tag k="blah2" v="b"/>
</properties>

while each segment then just refers to that properties block. A recent
planet.osm (2006/07) contains 733131 "created_by JOSM" tags; that equals
23MB of data and takes almost 16% of total space. Given that many roads
are completely identical - apart from logical properties, which are
stored in ways - the overhead would not be that big, since the grouping
works best when done worldwide.
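
A minimal sketch of the grouping idea, assuming segments arrive as a
mapping from segment id to a tag dictionary. The function names
(group_properties, render_block) and the tag values are hypothetical and
only for illustration; the <properties> and <tag> element names follow the
example above. It is not a description of how the server works.

```python
import xml.etree.ElementTree as ET
from typing import Dict, FrozenSet, Tuple

Tags = Dict[str, str]
TagKey = FrozenSet[Tuple[str, str]]


def group_properties(segments: Dict[int, Tags]):
    """Give every distinct tag set one shared id.

    Returns (blocks, segment_to_block): blocks maps a block id to its tags,
    segment_to_block maps each segment id to the block it refers to.
    """
    block_ids: Dict[TagKey, int] = {}
    segment_to_block: Dict[int, int] = {}
    for seg_id, tags in segments.items():
        # A frozenset of (key, value) pairs ignores tag order and is hashable.
        key = frozenset(tags.items())
        # Reuse the id of an identical tag set, otherwise hand out the next one.
        block_id = block_ids.setdefault(key, len(block_ids) + 1)
        segment_to_block[seg_id] = block_id
    blocks = {bid: dict(key) for key, bid in block_ids.items()}
    return blocks, segment_to_block


def render_block(block_id: int, tags: Tags) -> str:
    """Serialise one shared block as a <properties> element."""
    element = ET.Element("properties", id=str(block_id))
    for k, v in sorted(tags.items()):
        ET.SubElement(element, "tag", k=k, v=v)
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    example = {
        1: {"oneway": "yes", "lanes": "2"},
        2: {"lanes": "2", "oneway": "yes"},  # same tags, different order
        3: {"paving": "gravel"},
    }
    blocks, refs = group_properties(example)
    for bid, tags in blocks.items():
        print(render_block(bid, tags))
    print(refs)
```

Segments 1 and 2 end up pointing at the same block, so their tags are
stored once. How a segment would refer to its block in the xml output is
left open here.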

Ways that contain only logical information such as ref, name, bus route,
etc. make more sense to me. A way is best for a roadmap, and segments
serve for "topographical" maps.

I think editing a list of nodes is almost impossible in denser areas, and
having to give up segments would mean that a lot of short ways have to be
created. Doesn't that defeat the purpose of ways?

In my opinion it's easier for a realtime mapping application (on a
portable device) to deal with short units (segments) rather than with long
lists of nodes.

Whatever system is chosen, it has to be fast in two use cases:
a) server-side storage and retrieval
b) rendering and route planning

The best / most logical scheme isn't of use if the server chokes on it and
is always slow as a result. But as soon as the server is not the
bottleneck, a scheme has to be found that is fast in rendering and route
planning. I don't really have a clue what might be best here, but a
portable device will most likely not be able to run a database server or
have much spare computing power (-> low battery life).

I imagine that automated data conversion - for example from a list of
nodes to segments - could cause problems for some use cases. What about
a road that is physically one, but carries two or more different refs?
Getting that back together from two lists of nodes needs serious number
crunching (checking every new node against the nodes already drawn). There
surely are more disadvantages.
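
To make the check concrete, here is a rough sketch of finding the stretch
that two node lists have in common, assuming both ways reference the same
node ids along the shared part of the road. The names to_segments and
shared_segments are made up for this example. It says nothing about how
expensive this gets on real data or on a portable device; that is exactly
what would need measuring.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def to_segments(nodes: List[int]) -> List[Edge]:
    """Turn a node list into undirected segments (consecutive node pairs)."""
    # Store the smaller id first so direction of travel does not matter.
    return [(a, b) if a <= b else (b, a) for a, b in zip(nodes, nodes[1:])]


def shared_segments(way_a: List[int], way_b: List[int]) -> List[Edge]:
    """Return the segments of way_b that also occur in way_a."""
    seen = set(to_segments(way_a))
    return [edge for edge in to_segments(way_b) if edge in seen]


if __name__ == "__main__":
    way_ref_1 = [1, 2, 3, 4, 5]
    way_ref_2 = [9, 3, 4, 5, 8]  # joins at node 3, leaves at node 5
    print(shared_segments(way_ref_1, way_ref_2))
```

With explicit segments the shared stretch is already a set of common
units; with node lists it has to be rediscovered like this each time.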

Before a decision is made, we need benchmarks, and applications that use
osm data for something other than editing, or at least a concept of how
that could best be done. Don't we?

happy mapping,
Wollschaf