[Talk-GB] No more voting on mechanical edits

SomeoneElse lists at atownsend.org.uk
Fri Dec 19 12:10:17 UTC 2014


On 18/12/2014 18:59, Rovastar wrote:
> Well please share the thoughts about what suggestions you have.
>

The big problem is not really whether a particular shop has an 
apostrophe in the name or not, but the fact that we don't have anything 
like all of said shops mapped.  I suggested that any plan for changes to 
the shop names and values that we have now would also need to address 
how new users decide which ones to use.

For iD, names are suggested via 
https://github.com/osmlab/name-suggestion-index/ , and 
https://github.com/osmlab/name-suggestion-index/blob/master/canonical.json 
is the "canonical list of known good ones".  I suggested to Matthijs 
that some sort of localisation might make sense there, since shop names 
do vary (and, thinking further about it, shop functions do too - an 
Australian Woolworths is very different to what a UK Woolworths was).  
He was aware of name-suggestion-index but didn't seem to be aware of the 
canonical list.
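
As a rough illustration of what that sort of localisation could look 
like, here is a minimal sketch of a lookup keyed on country as well as 
name.  The structure is hypothetical (it is not the format that 
canonical.json uses), and the table name, function name and tag values 
are all assumptions chosen just to show the Woolworths case.

```python
# Hypothetical localised suggestion table; NOT the real canonical.json format.
# Keys are (ISO country code, shop name); values are tags a suggestion might carry.
# The tag values are assumptions for illustration only.
LOCALISED_SUGGESTIONS = {
    ("AU", "Woolworths"): {"shop": "supermarket"},
    ("GB", "Woolworths"): {"shop": "variety_store"},
}


def suggest_tags(country, name):
    """Return the suggested tags for a name in a country, or None if unknown."""
    return LOCALISED_SUGGESTIONS.get((country, name))
```

The point of the sketch is only that the same name can map to different 
tags depending on where the shop is, so a single global entry per name 
would get one of the two wrong.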

Speaking entirely personally, I don't think that Matthijs suggesting 
that we add e.g. a missing apostrophe to a shop brand that is 
well-established as having one** is necessarily "wrong", it's just 
"almost entirely pointless" if we have so few of that shop brand mapped 
that the data isn't really useful.  Postprocessing data from large 
databases to make sense of it is something that you _always have to 
do_*.  It's not just OSM; any large dataset has this problem.   Try 
extracting data for railway stations as an example (seriously - try it - 
don't just write an email about it - actually try it, look at the 
exceptions and see what you get).  Is that preserved railway station a 
"station"?  What about the miniature railway in a park?  What actual 
features did $customer want when they were looking for a "station" 
anyway?  When OSM's data is more complete it might make more sense to 
say "right, now let's look at those exceptions" - but that has to be done 
on a case by case basis, you can't just assume that X is Y, because 
you've seen an X locally and have never been to the area where Y is.  
Having 10 people ticking a box on a wiki doesn't address that problem; a 
proper discussion does.
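
To make the railway station example concrete, the sort of 
postprocessing a data consumer ends up doing might be sketched as 
below.  This assumes the elements have already been extracted as plain 
tag dictionaries; the function names are hypothetical, and the tags 
checked (station=miniature, usage=tourism, railway:preserved=yes) are 
assumptions about how such stations might be tagged, not a complete 
list.  What matters is that the exceptions are set aside for a 
case-by-case look rather than silently kept or dropped.

```python
# Tag names checked here are assumptions for illustration; look at the
# actual exceptions in real data before relying on any of them.
def classify_station(tags):
    """Sort one element's tags into a rough bucket."""
    if tags.get("railway") != "station":
        return "not_a_station"
    if tags.get("station") == "miniature" or tags.get("usage") == "tourism":
        return "needs_review"
    if tags.get("railway:preserved") == "yes":
        return "needs_review"
    return "probably_mainline"


def split_for_review(elements):
    """Return (accepted, exceptions); exceptions need a case-by-case look."""
    accepted, exceptions = [], []
    for tags in elements:
        bucket = classify_station(tags)
        if bucket == "probably_mainline":
            accepted.append(tags)
        elif bucket == "needs_review":
            exceptions.append(tags)
    return accepted, exceptions
```

Whether the "needs_review" items count as stations depends on what the 
consumer of the data actually wanted, which is exactly why the decision 
can't be made once, globally, for everyone.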

Following on from that, removing "wrong" data from OSM globally does 
cause one problem - it makes it much harder to see which areas have been 
inexpertly mapped.  If someone's got the spelling or a shop tag woefully 
wrong, what about their other edits?  That wrong tag might be the 
"canary in the coal mine" that indicates other problems that need a 
proper survey to investigate and fix.  Another similar issue is missing 
bridges over rivers and streams - adding a generic bridge might "fix" 
the problem on the QA site, but it takes away the pointer to an area 
that needs a survey (is there really a bridge, or a culvert under the 
road?).  That's why (despite the teeth-sucking on the #osm-gb list 
whenever it happens) I think that Matthijs' adding of OSM notes for 
these "miscategorised" shops is an excellent idea, though I wish that 
each note contained a link to the item in question.

What we seem to be forgetting in this discussion is that we're all 
supposed to be on the same side here, something that the name-calling 
(e.g. referring to someone as an "OSM dinosaur") and cheap 
points-scoring doesn't help with.  Many people in OSM regularly help 
other people with their pet projects.  For example, I've mapped more 
bits of Derwent Aqueduct infrastructure than any sane person could show 
a reasonable interest in (sorry Paul if you're reading) and I've also 
tried to help Matthijs get community acceptance for what he's trying to 
achieve here.  We have to work together, but in the case of mapping 
shops (the 90% that we don't know about yet), the main thing that you 
have to do is to _actually go out and map the shops_.  You can't do it 
from behind a computer keyboard.

Cheers,

Andy


* I've worked on statistical data extraction and combination from 
mechanical and digital systems on and off since the mid-1980s.

** Some brands do seem to use entirely consistent branding, some do not 
and some are in a process of change (as discussed at length on the 
previous thread).


