[Tagging] man_made=works

Daniel Koć daniel at xn--ko-wla.pl
Mon Jun 1 13:50:07 UTC 2015

On 01.06.2015 13:51, Martin Koppenhoefer wrote:
> 2015-06-01 11:51 GMT+02:00 Daniel Koć <daniel at koć.pl>:
>> You _have_ to decide whether it's a shop=travel_agency or
>> office=travel_agent.
> no, you don't have to, you can also use both tags if you think they
> should apply both.

But it's not just a simple case of a mapper not knowing what category 
key should be used. It's that we as a project don't know how to properly 
categorize this kind of object.

We could apply that fix to every object whose type is unclear, but how
much better is such double (or multiple) tagging than one object
belonging to two (or more) categories? From the ground truth perspective
it is just one travel agency, not two different entities.

> having few keys and many values has a certain advantage in certain
> database systems, turning this the other way round will not let you
> gain much but will have negative implications for those systems. Plus
> you lose the generic "classes" that might be useful for finding tags
> or for filtering (e.g. "all shops in this area" is harder or
> impossible if the scheme goes: grocery=yes / shop).

I don't want to lose any of the current classes! They will always be
needed, at least for legacy migration purposes. I just want to make them
a broader, more flexible context for the objects.

If we have a more detailed tree of objects, we can still filter all the
objects belonging to a category and its subcategories. If you're really
sure a grocery should always be treated as a shop, even if the mapper
did not describe it as grocery=shop (or grocery=yes+shop=yes), you can
make any value (except grocery=no, probably) mean "it's a shop".
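
A minimal sketch of that kind of filtering, assuming a hypothetical
child-to-parent category table (PARENT, ancestors and belongs_to are
illustrative names, not an existing OSM schema or tool):

```python
# Hypothetical category tree, child -> parent. Illustrative only.
PARENT = {
    "grocery": "shop",
    "travel_agency": "shop",
    "shop": None,
}

def ancestors(category):
    """Return the category followed by all of its ancestors."""
    chain = []
    while category is not None:
        chain.append(category)
        category = PARENT.get(category)
    return chain

def belongs_to(tags, wanted):
    """True if any category key with a value other than 'no' falls under `wanted`."""
    return any(
        wanted in ancestors(key)
        for key, value in tags.items()
        if key in PARENT and value != "no"
    )

objects = [
    {"grocery": "yes", "name": "Corner Grocery"},
    {"grocery": "no", "name": "Not a grocery"},
    {"amenity": "bench"},
]

# "All shops in this area" still works: grocery is a subcategory of shop.
shops = [o for o in objects if belongs_to(o, "shop")]
```

Here the "grocery implies shop" rule lives in the tree used by the
client, not in the tags the mapper had to type.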

But what if you were wrong and there are, for example, some places
where groceries are free to take? You're still sure that grocery=shop is
a shop, but you may start using grocery=free for something else - and
maybe treat all the other grocery=* values as something different too,
or just drop the default "shop" category to be safe and wait for the
mappers to make sure and tag it accordingly.

That's a simple rule, I guess: if the mapper is sure that grocery=shop,
you can trust her. But if she's not and refuses to decide, you can still
enforce the type, just like now - except it's then explicitly known
that the choice is yours, not the mapper's! Right now we're silently
forcing the mapper to decide even if she's not sure; with categories
not being compulsory, we have fewer "hidden" errors in the data itself,
and all the responsibility for dealing with generalities and ambiguities
falls on a proper category tree and the clients using it. But with more
responsibility comes more flexibility. =}
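
One possible reading of this rule as a sketch (CLIENT_DEFAULTS and
resolve_type are hypothetical names, and the values are only examples):
the result records who made the decision, the mapper or the client.

```python
# Hypothetical client-side defaults; not part of any real tool.
CLIENT_DEFAULTS = {"grocery": "shop"}

def resolve_type(key, value):
    """Return (type, decided_by) for a tag such as grocery=shop."""
    if value == "no":
        # The mapper explicitly said it is not this kind of object.
        return (None, "mapper")
    if value != "yes":
        # The mapper stated the type herself, e.g. grocery=shop or grocery=free.
        return (value, "mapper")
    # grocery=yes: the mapper did not decide, so the client applies
    # its own default and the choice is visibly the client's.
    return (CLIENT_DEFAULTS.get(key), "client")

explicit = resolve_type("grocery", "shop")
undecided = resolve_type("grocery", "yes")
```

With this split, an undecided grocery=yes stays undecided in the data,
and any guess is attributable to the client that made it.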

> If we want to have consistent tagging and details, simple "presets"
> the way we have them now will not be sufficient. We should better have
> a guided scenario (aka "wizard") that asks the mapper some questions
> and in the end proposes the tags to set or states that there aren't
> yet tags for this kind of thing. This would also discourage people from
> using similar but not pertinent tags.

Great idea! This could also be a perfect feedback loop for us to learn
what people are trying to achieve, so we know which tagging schemes are
missing.

"The train is always on time / The trick is to be ready to put your bags 
down" [A. Cohen]
