[Tagging] football or soccer ?
slhope at gmail.com
Fri Jul 2 02:31:18 BST 2010
On 1 July 2010 22:08, John F. Eldredge <john at jfeldredge.com> wrote:
> In fact, the technique of having the user select from a list of words, but actually storing the value as an arbitrary ID (generally numeric), is the recommended technique in database design. It is called "normalizing the database". Having the linking value be an ID value means that, should you want to change the verbal description of the value, for example from soccer to association_football, you only have to change the value once, in the lookup table, rather than changing it in thousands of places.
Normalising a database actually refers to something a little
different from this, and as soon as you get out of theory into real
world database design you find that full normalisation is not always
the best way to do something - you can over-normalise a database. You
need to make choices depending on your platform, and how it's used.
So let's look at the effects if we made this change, and decide if we'd
like the effects.
- We can easily change the description of tags, without having to
change the main database. Good, for the reasons above. However, this
also means that we can easily corrupt tags - change their meaning from
what it was when the tag was used. This would be much easier to do
than now, where you have to do bulk updates. Switch a couple of
descriptions around, and you could have people entering creeks as paths,
brothels as churches (or vice versa), or whatever other mischief you
had in mind. And it wouldn't have to be malicious, it could be
- Free tag editing would become more dangerous (maybe extinct?). If
somebody tags a way as highway=primery, we can figure what they
(probably) meant. But if they tag it as 17364 instead of 17634, we
have no idea what they actually meant. And if both numbers are in the
list, it may be a very hard error to spot. Whether the removal of free
tag entry is a good or bad thing would depend on who you ask.
- There would be a two step process to using a new tag - create the
tag first, then use it. The editors could streamline this, but making
it obvious to the user they are creating a new tag may be a good idea.
- Having a table of used tags would allow us to add extra metadata to
the tags in one central place. We could have translations and
localisations, multiple categorisations for a single tag (eg tag X
could be under natural, leisure, and landcover), and other options.
- There is currently no differentiation between keys that expect free
text (name=, note= etc) and keys that expect one of a limited number
of options (highway=). We'd still need the free text option for some
tags, but would we need to cut it off entirely for others?
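
To make the first two points concrete, here is a rough sketch in Python.
Everything in it is an assumption for illustration only: the table
layout, the ID 101, the way names, and the meanings attached to 17364
and 17634 are made up, and this is not how the OSM database stores tags.

```python
import difflib

# Hypothetical lookup table: numeric tag ID -> tag text.
tag_lookup = {
    101: "sport=soccer",
    17364: "highway=path",
    17634: "waterway=stream",
}

# Hypothetical map data: each way stores only a tag ID, never the text.
way_tags = {"way_a": 101, "way_b": 17364, "way_c": 17634}


def describe(way):
    """Resolve a way's stored ID to whatever the lookup table says now."""
    return tag_lookup[way_tags[way]]


# Benefit: a rename is a single change in the lookup table.
tag_lookup[101] = "sport=association_football"
renamed = describe("way_a")

# Risk: swapping two descriptions changes the meaning of every existing
# use, without touching the map data at all.
tag_lookup[17364], tag_lookup[17634] = tag_lookup[17634], tag_lookup[17364]
corrupted = describe("way_b")  # mapped as a path, now reads as a stream

# A free-text typo is still close enough to a known value to guess at.
known_highways = ["primary", "secondary", "tertiary", "residential"]
suggestions = difflib.get_close_matches("primery", known_highways, n=1)

# An ID typo (17634 typed instead of 17364) is itself a valid ID, so a
# simple "does this ID exist?" check does not flag it.
typo_id = 17634
typo_is_caught = typo_id not in tag_lookup
```

The rename and the swap are the same kind of operation on the lookup
table, which is why making one easy also makes the other easy.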