[Openstreetmap-dev] Re: OSM Schema Design
M Josenhans
m_josenhans at web.de
Sun Jan 22 22:17:42 GMT 2006
Hello,
Immanuel Scholz wrote:
> There is an XML Schema on the wiki which describes the format of the data
> objects when transfered. And there is a REST documentation describing the
> different ways of accessing the data.
I have seen that. It's a good start and it's good to be able to validate
the XML data generated by the server with the XML Schema.
XML Schemas are written in a way that makes them almost impossible to
author without an XML Schema editor. XML DTDs are easier for humans to
read, but they are less complete in defining the data structure.
It is my understanding that in the medium term OSM street map data will
need to be structured differently than today, as there are many things
that have not been covered yet.
Here are some examples:
rivers, lakes, bridges, house numbers, forests, street types (e.g.
motorways, country roads, city roads, bicycle roads), one way streets,
railways, restrictions (max vehicle speed, max vehicle height, max
vehicle weight), roundabouts, motorway drive-up, country borders,
information to support routing, railway stations, etc.
While it's hard to see what will be needed in the future, I can offer to
facilitate this discussion.
> Currently it is possible to access:
> - single node
> - list of nodes by id
> - single segment
> - list of segments by id
> - list of objects by range (lat/lon rectangle)
> - list of gps points by range
Yes. This API supports many different usage scenarios. However, it
likely retrieves more data than actually needed and requires several
HTTP transactions to complete. Thus there is room for 'protocol'
optimization. Please correct me if my assumption is wrong.
APIs and schemas designed around the real usage scenarios can be
optimized better. Thus I'd suggest identifying the usage scenarios and
starting with optimization at the 'specification level' first.
>> Especially I'd like to separate the discussion on what is transfered
>> ('structure') from the discussion on how it is encoded ('encoding').
>> Hopefully protocol background will be useful here.
> I think profiling skills will help MUCH more. ;)
Sorry. Can't help here. ;-)
> The ruby server is surprisingly different from usual performance
> patterns.
> Don't know what you mean by that either,
I mean that it is good to have an understanding of what shall be
transferred, without defining how it shall be encoded (XML, CSV,
JSON, ...).
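To illustrate the separation of 'structure' from 'encoding', here is a
minimal Python sketch. The Node fields (id, lat, lon) and the element
and key names are assumptions made for the example, not the actual OSM
schema. The structure is defined once, and each encoder is a separate
function that can be swapped without touching it.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class Node:
    # Hypothetical structure definition; the real OSM fields may differ.
    id: int
    lat: float
    lon: float


def encode_xml(nodes):
    """Encode the nodes as XML (one possible encoding)."""
    root = ET.Element("osm")
    for n in nodes:
        ET.SubElement(root, "node", id=str(n.id), lat=repr(n.lat), lon=repr(n.lon))
    return ET.tostring(root, encoding="unicode")


def encode_json(nodes):
    """Encode the same structure as JSON (another possible encoding)."""
    return json.dumps({"nodes": [asdict(n) for n in nodes]})


nodes = [Node(1, 51.5, -0.1), Node(2, 51.6, -0.2)]
xml_text = encode_xml(nodes)
json_text = encode_json(nodes)
```

Both outputs carry the same information; only the encoder differs.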
> but it is the current plan to
> test implement a CSV output of the object schema (all XML stuff replaced
> with a simple CSV), because the server spent most of the time encoding the
> XML.
Usually decoding is the bigger bottleneck, however I guess this is
handled by the Web-client. :-)
By the way: unlike XML Schemas, which define not only the data
structure but also how the data is encoded, ASN.1 schemas define only
how the data is structured and leave the encoding to a separate,
appropriate encoder.
> However, nobody came up yet with an actual CSV structure definition, so if
> you want to do that, be welcome.
I doubt that CSV can usefully store complex structured data such as
map data.
I'd guess that an encoding like JSON might be an improvement over XML:
http://en.wikipedia.org/wiki/JSON
If you know the exact structure of the data on both sides, it's even
possible to skip the redundant type information. E.g. why transfer
'OSM' when it will be there anyway? This is how ASN.1 deals with
encoding.
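As a rough illustration of skipping redundant type information, the
following Python sketch compares a self-describing encoding with a
positional one. The field order in FIELDS is an assumption that both
sides would have to agree on in advance; it is not part of any existing
OSM format.

```python
import json

# Hypothetical field order, known to sender and receiver beforehand.
FIELDS = ("id", "lat", "lon")


def encode_self_describing(rows):
    """Repeat the field names in every record."""
    return json.dumps([dict(zip(FIELDS, r)) for r in rows])


def encode_positional(rows):
    """Send only the values; the position of a value implies its meaning."""
    return json.dumps(rows)


def decode_positional(text):
    """Restore the field names on the receiving side from the shared order."""
    return [dict(zip(FIELDS, r)) for r in json.loads(text)]


rows = [[1, 51.5, -0.1], [2, 51.6, -0.2]]
verbose = encode_self_describing(rows)
compact = encode_positional(rows)
```

The positional form is shorter and decodes to the same records, at the
cost that both sides must be updated together when the structure
changes.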
Usually the ASN.1 encodings (BER and PER) are about 100 times more
efficient to decode than XML. However, BER and PER generate binary
data, and there are surely no ASN.1 tools that support Ruby and
JavaScript. It looks like it might be useful to invent an efficient
ASCII encoder.
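One way such an ASCII encoder could look is sketched below in Python.
The line format (a one-letter record type followed by space-separated
values) and the node and segment fields are invented for this example;
they are not a proposal that exists anywhere in OSM.

```python
def encode_ascii(nodes, segments):
    """One record per line: 'n id lat lon' for nodes, 's id from to' for segments."""
    lines = [f"n {i} {lat!r} {lon!r}" for i, lat, lon in nodes]
    lines += [f"s {i} {a} {b}" for i, a, b in segments]
    return "\n".join(lines) + "\n"


def decode_ascii(text):
    """Parse the line format back into node and segment tuples."""
    nodes, segments = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        kind, *fields = line.split()
        if kind == "n":
            nodes.append((int(fields[0]), float(fields[1]), float(fields[2])))
        elif kind == "s":
            segments.append(tuple(int(f) for f in fields))
        else:
            raise ValueError(f"unknown record type: {kind!r}")
    return nodes, segments


nodes = [(1, 51.5, -0.1), (2, 51.6, -0.2)]
segments = [(10, 1, 2)]
text = encode_ascii(nodes, segments)
```

Because the record type prefix distinguishes nodes from segments, a
single text stream can carry both kinds of objects, and decoding needs
only string splitting and number parsing.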
> If you want to profile the ruby server (or doing stress tests in other
> ways), I can give you directions on how to set up a test server on your
> local machine (if you are running linux).
I am running Debian. Thanks for the offer. Currently I probably lack the
time for detailed profiling.
Br,
Michael