dair at refnum.com
Fri Aug 3 13:43:26 BST 2007
Alex Mauer wrote:
>It's been nearly three weeks now since SOTM happened and STAGS was
>first described. I have yet to see any description of how STAGS
>might actually work in practice.
I'm a newcomer to OSM, so wasn't around for the initial
discussions regarding tagging. My day job is www.routebuddy.com,
where we use TeleAtlas data, so I found the "free form" OSM
model quite daunting.
I watched the STAGS presentation at SOTM and thought it was a
good step forward: although being able to tag anything and
everything makes it easier to capture data, actually doing
something (rendering, routing, etc) with OSM data is harder than
with TeleAtlas/Navteq data.
That's primarily because commercial data comes with a well
defined set of tags and values, which rarely change. Of course
the downside is that they either won't capture, or will capture
incorrectly, things that don't exactly match their categories.
I can't really offer any insight into STAGS, but the talk made
me think of the type system Apple use for
file/clipboard/stream/etc data (UTIs, or Uniform Type Identifiers):
UTIs provide a structure to relate values to one another, while
still allowing extensions to capture new types of value rather
than mis-casting them into an existing type.
The hierarchy aspect of it is very powerful. E.g., a KML file
could be defined as being a type of XML file. An XML file is
itself a type of text file, which is a sub-type of a file, etc.
An app like Google Earth could then recognise a KML file as
containing KML data, while an XML editor could also recognise it
as some kind of XML (and so be able to view it, if not edit it).
Similarly a simple text editor would know that even though it
doesn't know what KML or XML are, it knows that this type of
file is some kind of text file and so it can open it too.
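A minimal sketch of that conformance idea, in Python. The type names here ("kml", "xml", "text", "file") are simplified stand-ins I've made up for illustration, not Apple's actual identifiers, and the functions are hypothetical rather than any real UTI API:

```python
# Hypothetical, simplified type table: each type names the one type it
# conforms to (None marks the root). These are illustrative names only.
PARENT = {
    "kml": "xml",
    "xml": "text",
    "text": "file",
    "file": None,
}


def ancestors(type_id):
    """Return type_id followed by every type it conforms to, most specific first."""
    chain = []
    while type_id is not None:
        chain.append(type_id)
        type_id = PARENT.get(type_id)  # unknown types simply end the chain
    return chain


def conforms_to(type_id, wanted):
    """True if type_id is, or is some sub-type of, wanted."""
    return wanted in ancestors(type_id)


def can_open(app_types, type_id):
    """True if an app that understands app_types can handle type_id."""
    return any(conforms_to(type_id, t) for t in app_types)
```

A text editor that only declares "text" would get True from can_open({"text"}, "kml"), even though it has never heard of KML or XML.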
I thought it'd be worth mentioning UTIs as they, or something
similar, may be a useful idea for STAGS. I think it would help
bridge two needs: being able to make up new tags to capture new
situations, and adding some structure to make the data easier to
use.
It would also make it possible to namespace tags, relate them to
each other, usefully process tags defined after the code was
written, etc.
E.g., you know that a "zebra crossing" is some kind of
"pedestrian crossing" - and while you may not know what a
"wombat crossing" actually is, you can still identify it as some
kind of "pedestrian crossing" and by implication some kind of
"crossing" (without needing three tags on every way, identifying
it as a crossing, pedestrian crossing, and wombat crossing).
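As a rough sketch of how a renderer or router might use this, again in Python: the tag names and the TAG_PARENT table below are made up for illustration (they are not real OSM or STAGS tags), and resolve() is a hypothetical helper. The way carries only its most specific tag, and the consumer walks up the hierarchy until it finds something it understands:

```python
# Hypothetical tag hierarchy: each tag names its more general parent.
# A tag with no entry is treated as a root.
TAG_PARENT = {
    "wombat_crossing": "pedestrian_crossing",
    "zebra_crossing": "pedestrian_crossing",
    "pedestrian_crossing": "crossing",
}


def resolve(tag, understood):
    """Return the most specific tag in `understood` that `tag` is a kind of.

    Walks from `tag` up through its parents and returns the first one the
    consumer understands, or None if nothing in the chain is understood.
    """
    seen = set()  # guards against an accidental cycle in the table
    while tag is not None and tag not in seen:
        if tag in understood:
            return tag
        seen.add(tag)
        tag = TAG_PARENT.get(tag)
    return None
```

So code written before "wombat_crossing" existed, and which only understands "zebra_crossing" and "pedestrian_crossing", would still treat a wombat crossing as a pedestrian crossing.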
I'm very aware that he who writes the code makes the rules, and
IMO one of the reasons OSM is doing so well is that it's
ruthlessly pragmatic. So I'm not claiming that a UTI scheme
would solve every problem, or that it should be used to define
some hypothetical exhaustive hierarchy of types.
But it might be a useful model to look at, given that it tries
to solve a similar problem of helping apps make sensible
decisions about the open set of file types.
I thought I'd mention it as a voice from the commercial side, as
someone who writes code based on "the competition". :-)
dair at refnum.com http://www.deathvalleycycle.com/