[Openstreetmap-dev] Re: OSM Schema Design

Mon Jan 23 09:30:01 GMT 2006

Hi,

> I have seen that. It's a good start and it's good to be able to validate
> the XML data generated by the server with the XML Schema.

Currently there is no XML Schema in the form of a schema file. Every
time I say "XML Schema" I mean the rough verbal definition on the
wiki.

Since we already have performance problems within the XML code of the
server, I would strongly prefer not to add automatic XML validation
until some profiling is set up to show that it does not hurt
performance as well (I don't trust Ruby's XML code anymore ;-)

> It is my understanding that in mid term OSM street map data needs to be
> structured differently than today, as there are many things, which have
> not been covered yet.

Maybe, maybe not. I believe the current data structure is powerful enough
to handle all needs an open STREET map could have.

> rivers,

Property "class=river" on a street could be a way to do this.

> lakes,

Property "class=lake" on an area.

> bridges,

"class=bridge" on a line segment

> house numbers

"house_number=xx" on a node

> forests

"class=forest" on an area

> street types (e.g. motorways, country roads, city roads, bicycle roads),
> one way streets,

All these are properties on a street.

> railways,

Property on a street

> restrictions (max vehicle speed, max vehicle height, max
> vehicle weight),

Either property on a street or on a line segment

> roundabouts,

Property on a node for small roundabouts or on several line segments for
more complex ones. Maybe on a street which contains exactly the line
segments that are part of the roundabout.

> motorway drive-up,

Property on a line segment, node or street - depending how complex the
drive up is.

> country borders,

Although I doubt this belongs in a street map database, if you want
to enter it, make it a property on a street surrounding the country.

> information to support routing,

Property on the object you want to give a hint for.

> railway stations, etc.

Property on a node.

I hope you see my point. I strongly object to making the data structure
more complex than necessary without a good argument.

Maybe there will be reasons for not expressing something as properties but
building it into the data structure instead. Please argue why you think a
change of the data structure is necessary for the examples above.
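
To illustrate the idea, here is a minimal sketch (in Python, purely for
illustration; it is not the server's actual data model) of nodes, line
segments and streets that each carry free-form properties. The class and
field names, the helper with_prop and the "railway_station" value are
assumptions made up for this example; "class=river" and "class=bridge"
are the properties suggested above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    lat: float
    lon: float
    props: dict = field(default_factory=dict)


@dataclass
class Segment:
    id: int
    from_node: int
    to_node: int
    props: dict = field(default_factory=dict)


@dataclass
class Street:
    id: int
    segment_ids: list
    props: dict = field(default_factory=dict)


def with_prop(objects, key, value):
    """Return the objects whose property `key` equals `value`."""
    return [o for o in objects if o.props.get(key) == value]


# New kinds of features need new property values, not new object types.
station = Node(1, 51.5, -0.12, {"class": "railway_station"})  # hypothetical value
bridge = Segment(20, 1, 2, {"class": "bridge"})
river = Street(100, [10, 11], {"class": "river"})

everything = [station, bridge, river]
rivers = with_prop(everything, "class", "river")
```

Adding forests or lakes in this sketch would only mean agreeing on another
value for "class"; the three object types stay the same.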

> While it's hard to see what will be needed in the future, I can offer to
> facilitate this discussion.

There are several wiki pages around that discuss which keys with which
values should be used to characterize the different things you mentioned
above. Maybe helping there is what you want?

>  > - single node
>  > - list of nodes by id
>  > - single segment
>  > - list of segments by id
>  > - list of objects by range (lat/lon rectangle)
>  > - list of gps points by range
>
> Yes. This API allows to perform many different usage scenarios. However,
> it likely retrieves more data than actually needed and requires several
> http-transactions to complete. Thus there is room for 'protocol'
> optimization.

I see room for simplifying the API. For example, I don't see the need to
get single objects by id when they already come fully described out of
the map request.
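
As a rough sketch of why the by-id calls look redundant: a request by
lat/lon rectangle can already hand back complete objects, so a client has
nothing left to look up afterwards. The function name map_request and the
dictionary layout are assumptions for this example (in Python, for
illustration only), not the real API.

```python
def map_request(nodes, segments, min_lat, min_lon, max_lat, max_lon):
    """Return fully described nodes and segments inside a rectangle.

    Only segments with both end nodes inside the rectangle are returned,
    so every node a segment refers to is part of the same answer.
    """
    inside = {
        n["id"]: n
        for n in nodes
        if min_lat <= n["lat"] <= max_lat and min_lon <= n["lon"] <= max_lon
    }
    selected = [
        s for s in segments if s["from"] in inside and s["to"] in inside
    ]
    return {"nodes": list(inside.values()), "segments": selected}


nodes = [
    {"id": 1, "lat": 51.50, "lon": -0.12, "props": {}},
    {"id": 2, "lat": 51.51, "lon": -0.11, "props": {}},
    {"id": 3, "lat": 48.85, "lon": 2.35, "props": {}},
]
segments = [
    {"id": 10, "from": 1, "to": 2, "props": {"class": "bridge"}},
    {"id": 11, "from": 2, "to": 3, "props": {}},
]

result = map_request(nodes, segments, 51.0, -1.0, 52.0, 0.0)
```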

Steve is currently simplifying the database and I am sure he has some
ideas for changes to the 0.3 API too. ;-)

> APIs and schemes optimized for the real usage scenarios can be optimized
> better. Thus I'd suggest to identify the usage scenarios and start with
> optimization on 'specification level' first.

How many successful Open Source projects have you seen that follow the
waterfall model and start with a pure specification phase? I think the
"code on demand" approach is far better here.

If you have a specific request for a specific usage scenario, specify your
need. Then find a coder for it or code it yourself. If it is cool and
would simplify another usage scenario, speak with the coder of that
scenario and maybe he will change his protocol to support your interface
as well...

As an example, if you are planning to provide a good data cache for the
OSM server, you probably want more raw access to the database. Try to
figure out what data access you want for the server, have a look at how
the database is structured (and maybe how it could be improved to serve
your needs), then write some Ruby to access it (or find someone to write
it for you). And you are done.

But please do not try to foresee all possible usages of all possible
applications that could come and try to find an access method to serve
them all.

>> but it is the current plan to
>> test implement a CSV output of the object schema (all XML
>> stuff replaced
>> with a simple CSV), because the server spent most of the time encoding
>> the XML.
> Usually decoding is the bigger bottleneck, however I guess this is
> handled by the Web-client. :-)

It is not. It is done in the applet (for editing) and in the server (for
receiving uploads and creating the tiles in non-editing mode).

However, even the XML encoding is too slow.

> By the way: unlike XML Schemes, which not only define the data
> structure, but also how the data is encoded, ASN.1 schemes just define
> how the data is structured and keep the data encoding separated to an
> appropriate encoder.

I think XML was not chosen because XML Schemes had to be used. XML was
chosen because it was the first and simplest idea that worked. Evidence
for this is that no XML scheme validation is present anywhere in the code
now.

If you know ASN.1 well, point some Ruby coders to ASN.1 libraries, define
an ASN.1 scheme for how data is transferred, and convince those coders
that ASN.1 is better than, say, CSV or XML, then maybe it will get
implemented and used as the transport mechanism.

I never looked at ASN.1 more than I was forced to during my studies. To me
it looks weird, bloated and complex. I prefer simple solutions.

> With CSV, I would doubt that it is usefully capable to store complex
> structured data such as map data.

That may be the reason why nobody has come up with an encoding
specification yet.
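
Just to make the question concrete, here is one way such an encoding
could look. No CSV specification exists, so everything here is invented
for the example (in Python, for illustration only): one record per line,
the record type in the first column, and the properties packed into the
last column as "key=value" pairs separated by ";".

```python
import csv
import io


def encode(objects):
    """Encode node and segment dictionaries as CSV text (hypothetical layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for obj in objects:
        props = ";".join(f"{k}={v}" for k, v in sorted(obj["props"].items()))
        if obj["type"] == "node":
            writer.writerow(["node", obj["id"], obj["lat"], obj["lon"], props])
        else:
            writer.writerow(["segment", obj["id"], obj["from"], obj["to"], props])
    return buf.getvalue()


def decode(text):
    """Parse text produced by encode() back into dictionaries."""
    objects = []
    for kind, obj_id, a, b, props in csv.reader(io.StringIO(text)):
        parsed = dict(p.split("=", 1) for p in props.split(";") if p)
        if kind == "node":
            objects.append({"type": "node", "id": int(obj_id),
                            "lat": float(a), "lon": float(b), "props": parsed})
        else:
            objects.append({"type": "segment", "id": int(obj_id),
                            "from": int(a), "to": int(b), "props": parsed})
    return objects


objects = [
    {"type": "node", "id": 1, "lat": 51.5, "lon": -0.12,
     "props": {"house_number": "12"}},
    {"type": "segment", "id": 10, "from": 1, "to": 2,
     "props": {"class": "bridge"}},
]
text = encode(objects)
```

The sketch also shows the weak spot: a property value containing ";" or
a key containing "=" would break this layout, so a real specification
would need an escaping rule.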

> I'd guess that an encoder like JSON might be an improvement against XML:

Before using a more complex mechanism than XML, please do some profiling
(or find someone to do it) to see whether this is faster than XML in Ruby.
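
A measurement of that kind does not have to be big. The sketch below
shows the general shape, written in Python only for illustration: encode
the same nodes as XML, JSON and CSV and take the best time of a few runs.
The element and attribute names and the sample data are assumptions, and
the resulting numbers say nothing about Ruby; the same comparison would
have to be repeated with the Ruby libraries the server actually uses.

```python
import csv
import io
import json
import timeit
import xml.etree.ElementTree as ET

# Made-up sample data: 1000 nodes with id, lat and lon.
NODES = [
    {"id": i, "lat": 51.0 + i * 1e-4, "lon": -0.1 + i * 1e-4}
    for i in range(1000)
]


def encode_xml(nodes):
    root = ET.Element("osm")
    for n in nodes:
        ET.SubElement(root, "node", id=str(n["id"]),
                      lat=str(n["lat"]), lon=str(n["lon"]))
    return ET.tostring(root, encoding="unicode")


def encode_json(nodes):
    return json.dumps(nodes)


def encode_csv(nodes):
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for n in nodes:
        writer.writerow([n["id"], n["lat"], n["lon"]])
    return buf.getvalue()


def profile(nodes, repeat=5):
    """Return the best wall-clock time in seconds for each encoder."""
    encoders = {"xml": encode_xml, "json": encode_json, "csv": encode_csv}
    return {
        name: min(timeit.repeat(lambda fn=fn: fn(nodes), number=1, repeat=repeat))
        for name, fn in encoders.items()
    }


if __name__ == "__main__":
    for name, seconds in profile(NODES).items():
        print(f"{name}: {seconds * 1000:.2f} ms")
```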

> and there are surely no ASN.1 tools that support ruby and java script.

This is bad. Now you have to argue for changing to a different programming
language in addition to arguing why ASN.1 is better than XML or CSV ;-)

> Looks like it might be useful to invent an efficient ASCII encoder.

Already invented. It is called "comma separated values". ;-D

Ciao, Imi.