[Openstreetmap-dev] Re: OSM Schema Design

Sun Jan 22 22:17:42 GMT 2006

Hello,

Immanuel Scholz wrote:
 > There is an XML Schema on the wiki which describes the format of the data
 > objects when transfered. And there is a REST documentation describing the
 > different ways of accessing the data.
I have seen that. It's a good start and it's good to be able to validate 
the XML data generated by the server with the XML Schema.

Given the way XML Schemas are written, it is almost impossible to write 
them without an XML Schema editor. XML DTDs are easier for humans to 
read, however they are less complete in defining the data structure.

It is my understanding that in the medium term OSM street map data will 
need to be structured differently than today, as there are many things 
which have not been covered yet.

Here are some examples:
rivers, lakes, bridges, house numbers, forests, street types (e.g. 
motorways, country roads, city roads, bicycle paths), one-way streets, 
railways, restrictions (max vehicle speed, max vehicle height, max 
vehicle weight), roundabouts, motorway on-ramps, country borders, 
information to support routing, railway stations, etc.

While it's hard to see what will be needed in the future, I can offer to 
facilitate this discussion.

 > Currently it is possible to access:
 > - single node
 > - list of nodes by id
 > - single segment
 > - list of segments by id
 > - list of objects by range (lat/lon rectangle)
 > - list of gps points by range

Yes. This API supports many different usage scenarios. However, it 
likely retrieves more data than is actually needed and requires several 
HTTP transactions to complete. Thus there is room for 'protocol' 
optimization. Please correct me if my assumption is wrong.

APIs and schemas designed around the real usage scenarios can be 
optimized better. Thus I'd suggest identifying the usage scenarios and 
starting with optimization at the 'specification level' first.

 >> Especially I'd like to separate the discussion on what is transfered
 >> ('structure') from the discussion on how it is encoded ('encoding').
 >> Hopefully protocol background will be useful here.

 > I think profiling skills will help MUCH more. ;)
Sorry. Can't help here. ;-)

 > The ruby server is surprisingly different from usual performance
 > patterns.
 > Don't know what you mean by that either,

I mean that it is good to have an understanding of what shall be 
transferred, without defining how it shall be encoded (XML, CSV, 
JSON, ...).
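
To illustrate that separation, here is a minimal Python sketch. The `Node` 
fields (id, lat, lon) and the element names are my assumptions for 
illustration only, not the actual OSM schema. The structure is defined 
once, and each encoder is an independent function over it:

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Node:
    # Hypothetical structure; the field names are illustrative only.
    id: int
    lat: float
    lon: float


def encode_xml(nodes):
    # One <node/> element per node, fields as attributes.
    root = ET.Element("osm")
    for node in nodes:
        attrs = {key: str(value) for key, value in asdict(node).items()}
        ET.SubElement(root, "node", attrs)
    return ET.tostring(root, encoding="unicode")


def encode_json(nodes):
    # A list of objects, one per node.
    return json.dumps([asdict(node) for node in nodes])


def encode_csv(nodes):
    # One row per node, columns in the order id, lat, lon.
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for node in nodes:
        writer.writerow([node.id, node.lat, node.lon])
    return buf.getvalue()


nodes = [Node(1, 51.5, -0.1), Node(2, 51.6, -0.2)]
for encode in (encode_xml, encode_json, encode_csv):
    print(encode.__name__, len(encode(nodes)), "characters")
```

Changing the encoding here means swapping one function; the definition of 
what is transferred stays untouched.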

 > but it is the current plan to
 > test implement a CSV output of the object schema (all XML stuff replaced
 > with a simple CSV), because the server spent most of the time
 > encoding the XML.
Usually decoding is the bigger bottleneck, however I guess this is 
handled by the Web-client. :-)

By the way: unlike XML Schemas, which define not only the data 
structure but also how the data is encoded, ASN.1 schemas just define 
how the data is structured and leave the data encoding to a separate, 
appropriate encoder.

 > However, nobody came up yet with an actual CSV structure definition,
 > so if you want to do that, be welcome.

I doubt that CSV is really capable of storing complex structured data 
such as map data.

I'd guess that an encoding like JSON might be an improvement over XML:
	http://en.wikipedia.org/wiki/JSON

If you know the exact structure of the data on both sides, it's even 
possible to skip the redundant type information. E.g. why transfer 
'OSM' when it is going to be there anyway? This is how ASN.1 deals with 
encoding.
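
A small Python sketch of that idea, using JSON only as a convenient 
carrier. The field list `NODE_FIELDS` and the wrapper names are 
hypothetical; the point is that when sender and receiver share the 
schema, only the values need to travel:

```python
import json

# Hypothetical schema agreed on by both sides ahead of time.
NODE_FIELDS = ("id", "lat", "lon")


def encode_self_describing(nodes):
    # Every field name and the wrapper elements are repeated in the payload.
    return json.dumps({"osm": {"node": nodes}}, separators=(",", ":"))


def encode_positional(nodes):
    # Only the values are sent; their order is fixed by NODE_FIELDS.
    rows = [[node[field] for field in NODE_FIELDS] for node in nodes]
    return json.dumps(rows, separators=(",", ":"))


def decode_positional(text):
    # The receiver restores the field names from the shared schema.
    return [dict(zip(NODE_FIELDS, row)) for row in json.loads(text)]


nodes = [
    {"id": 1, "lat": 51.5, "lon": -0.1},
    {"id": 2, "lat": 51.6, "lon": -0.2},
]
verbose = encode_self_describing(nodes)
compact = encode_positional(nodes)
print(len(verbose), "vs", len(compact), "characters")
assert decode_positional(compact) == nodes
```

The price is that both sides must agree on the schema (and its version) 
out of band, since the payload no longer describes itself.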

Usually the ASN.1 encoding types (BER and PER) are about 100 times more 
efficient to decode than XML. However, BER and PER encoding generate 
binary data, and I doubt there are any ASN.1 tools that support Ruby 
and JavaScript. It looks like it might be useful to invent an efficient 
ASCII encoder.
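
Just to make the idea concrete, one possible shape for such an encoder 
(purely my assumption, not an existing OSM or ASN.1 format) would be a 
fixed-width binary record made ASCII-safe with Base64. The record layout 
and the 1e7 coordinate scaling below are illustrative choices:

```python
import base64
import struct

# Assumed record layout: uint32 id, int32 lat * 1e7, int32 lon * 1e7,
# little-endian, 12 bytes per node.
RECORD = struct.Struct("<Iii")
SCALE = 10_000_000


def encode_node(node_id, lat, lon):
    # Pack to fixed-width binary, then make it ASCII-safe with Base64.
    raw = RECORD.pack(node_id, round(lat * SCALE), round(lon * SCALE))
    return base64.b64encode(raw).decode("ascii")


def decode_node(text):
    # Reverse the two steps; no text parsing of numbers is needed.
    node_id, lat, lon = RECORD.unpack(base64.b64decode(text))
    return node_id, lat / SCALE, lon / SCALE


encoded = encode_node(1, 51.5, -0.1)
print(encoded, len(encoded), "characters")
print(decode_node(encoded))
```

Whether this would actually beat XML or CSV on the Ruby server would 
need to be measured; the sketch only shows that a fixed-length, 
schema-driven ASCII record is easy to produce and to decode.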

 > If you want to profile the ruby server (or doing stress tests in other
 > ways), I can give you directions on how to set up a test server on your
 > local machine (if you are running linux).

I am running Debian. Thanks for the offer. Currently I probably lack the 
time for detailed profiling.

Br,
Michael