[Tagging] Data metamodels

pmailkeey . pmailkeey at googlemail.com
Sat Jun 6 21:19:25 UTC 2015

On 6 June 2015 at 15:36, Colin Smale <colin.smale at xs4all.nl> wrote:

> On 2015-06-06 15:55, Daniel Koć wrote:
>> On 06.06.2015 11:29, Colin Smale wrote:
>>> Time to work towards an updated metamodel, with:
>>> * Multiple values (lists of values - sorting out the semicolon
>>> business?)
>> Sure!
>>> * Complex values (data structures - formalising the namespace syntax?)
>> Any example? I don't know what you are talking about.
> Take addresses as an example. An address is composed of a set of fields,
> like house number, postcode etc. These are mapped into the OSM tags
> "addr:street", "addr:postcode" and so on. You can consider an "address" to
> be a reusable definition, which can be used in many contexts. The current
> OSM syntax using the colon says that *this* use of "street" is in the role
> of a part of an "addr", and is semantically distinct from a "street" used
> as part of some other collection of values. All the data fields which are
> part of an "addr" are grouped together by the common prefix "addr:". But
> this usage of the colon to separate the namespace ID from the field is not
> actually part of the data metamodel. The key "addr:housenumber" is just a
> string and the colon is nothing special at the moment. It all hangs
> together with a sort of unwritten gentleman's agreement.
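
To make the quoted point concrete, here is a minimal Python sketch of
treating the colon as a real namespace separator. The function name
split_namespaces and the sample tags are hypothetical and only for
illustration; nothing in the current OSM data model does this.

```python
def split_namespaces(tags, sep=":"):
    """Separate plain keys from keys carrying a 'prefix:field' namespace.

    Hypothetical helper: splits each key at the first separator only.
    """
    plain, namespaces = {}, {}
    for key, value in tags.items():
        prefix, found, field = key.partition(sep)
        if found:
            # Group every field under its shared prefix, e.g. "addr".
            namespaces.setdefault(prefix, {})[field] = value
        else:
            plain[key] = value
    return plain, namespaces


# Sample tags, invented for this example.
tags = {
    "addr:street": "Pennsylvania Avenue NW",
    "addr:housenumber": "1600",
    "addr:postcode": "20500",
    "name": "The White House",
}
plain, namespaces = split_namespaces(tags)
```

With this, namespaces["addr"] holds the whole address as one reusable
structure, while "name" stays an ordinary key.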

My view of "address" is that it is one value composed of several
components. Whether any component is numeric or alphanumeric is probably
insignificant: an address is a name (a string), not a number. I see
little need to split "address" into separate tags (addr:number etc.) and
would prefer a single address value with its components separated by
commas. The number of components should not be limited, although more
than ten is unlikely. It is probably logical to treat the components as
we do numbers (and hopefully dates), listing them in order of
significance with the most significant first:

address=usa,20500,DC,Washington,Pennsylvania Avenue NW,1600,The White House

As addresses are mostly connected with routing, they belong on the
routing map rather than the topographical map. That is, an address should
be complete in itself and not rely on lying within any defined area
(state, county, zip, settlement, street), so that it does not depend on
other data. The exception would be if address data could be reliably
generated from the physical location of an address by reading the
topographical map, which would actually be a better way of holding the
data in the db.
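
A minimal sketch of how such a value might be read, assuming plain comma
separation with no escaping. parse_address is a hypothetical helper, not
an existing OSM tool.

```python
def parse_address(value):
    """Split a most-significant-first address value into its components."""
    # Empty components are dropped: the scheme has no fixed positions.
    return [part.strip() for part in value.split(",") if part.strip()]


components = parse_address(
    "usa,20500,DC,Washington,Pennsylvania Avenue NW,1600,The White House"
)

# Reversing the list gives the least-significant-first order that is
# conventional when an address is written out.
display = ", ".join(reversed(components))
```

Because the most significant component comes first, two addresses in the
same street share a common leading run of components.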

Current address components are too few - forcing components to share keys.
The above suggestion would eliminate this and allow for standardisation.

It would appear that allowing keys to have multi-component values would
solve many problems, such as the one currently being discussed:
shop=camera,video,frames....

Semicolons are a peculiar choice of delimiter; commas are much more common
("CSV"). If the component values are 'floating' (not tied to fixed
positions, ideally most important first), no component is ever empty, so
a double comma (,,) will never occur between components. That frees the
double comma to represent a literal comma inside a string.
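
A rough sketch of the double-comma idea, assuming a left-to-right scan in
which ",," always means a literal comma. split_components and
join_components are hypothetical names. One gap to be aware of: a
component that begins with a comma cannot be told apart from a previous
component that ends with one, so a real scheme would need a rule for that.

```python
def split_components(value):
    """Split on single commas; a doubled comma yields one literal comma."""
    parts, current, i = [], [], 0
    while i < len(value):
        ch = value[i]
        if ch == ",":
            if i + 1 < len(value) and value[i + 1] == ",":
                # Doubled comma: keep one comma as part of the component.
                current.append(",")
                i += 2
                continue
            # Single comma: the current component ends here.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def join_components(parts):
    """Inverse of split_components for components not starting with a comma."""
    return ",".join(part.replace(",", ",,") for part in parts)


simple = split_components("camera,video,frames")
escaped = split_components("Smith,, Jones & Co,London")
```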

>> You're right. I also argue we need a better category system, exactly
>> because we lose a lot of energy trying to put some real-life objects
>> into too narrow and fixed a categorization model.
>>> Time to take things to the next level!
>> Any practical hints how to do it?
> This is where it gets problematic. Any attempt in this direction will
> necessarily restrict the freedom of mappers, by saying there is a right
> way and a wrong way to do something.

If that is true, is it a good thing or a bad thing? It appears we're
already creating restrictions, and I don't think that can be helped if
there's going to be any hope of consistency at all. To the mapper, is the
category important? It's a pub, and I don't care what category it is in;
to me and to many others, it's not important. Even factual detail is not
important, such as whether the pub is a building or a beer tent. What
matters is that a member of the public can expect to be able to purchase
the typical things found in pubs.

> The theoretical side of creating an information metamodel is the easy bit.
> Getting the community to buy in to something that will need support from
> every stakeholder in the OSM ecosystem is a challenge that is better
> picked up by someone else with more diplomacy and patience than me...
> It's part of what I do for a living, and I try to pick my battles
> carefully.

In fact, Colin, that is not the problem. The problem is that OSM needs a
MUCH better decision-making engine. There is a lack of decisions: that is
clear from the apparently dead proposals. It is also clear that past
decisions were made on the incomplete data available at the time, such
that things would work better if they were changed. The other problem
with OSM is getting to the heart of it to progress anything; surely OSM
must be the biggest organisation by number of premises used, taking into
account all the ad-hoc mappers!

There are times when 'tweaks' are great and there are times when a complete
fresh start is a better foundation. I believe OSM needs the latter, making
use of all the known issues with the current OSM. Once a NEW OSM has been
worked out, it will then need 'selling' to the existing OSM, and an
agreement reached as to whether the transfer happens.

@millomweb <https://sites.google.com/site/millomweb/index/introduction> -
For all your info on Millom and South Copeland
via *the area's premier website - *

*currently unavailable due to the country's ongoing harassment of me, my
family, property & pets*

T&Cs <https://sites.google.com/site/pmailkeey/e-mail>