[OSM-talk-be] Why I'm against a full automatic import (ook in het Nederlands) (Was CRAB Import Tool)

Glenn Plas glenn at byte-consult.be
Tue Oct 22 07:17:30 UTC 2013


English only since I've slept minus 3 hours. Sorry.

I can't help but agree fully.  As an analogy: as we speak I'm trying to 
merge a customer's POI list of over 1000 POIs with my own version of 
that database, using the customer's foreign key to match records (it's 
for an API).  I'm actually pretty much against this action, but there 
are few alternatives since it was poorly planned by the customer.  I'm 
doing a full merge, so coordinates will come from mine and labels from 
the customer's.  (They are actually bus stops: public ones in Antwerp 
that are also used by private transport, such as the collective 
transport services run by BASF etc.)

This is only 1000 POIs and it's a merger's hell.  Arrival times differ 
between the two versions, and even the validity periods are offset.  
Sometimes important stuff has been put in a comment field.  This reminds 
me how hard it is to automate these things.  I've done jobs like this 
before, but here I am, using an Excel sheet (the customer's dump) and an 
SQLite database to construct this single table instead...
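
To make the shape of that merge concrete, here is a minimal sketch in 
Python with the standard sqlite3 module.  The table and column names 
(mine, customer, customer_ref, label, comment) and the sample rows are 
made up for illustration; they are not the real schema.  It joins both 
sources on the shared customer key, takes coordinates from one side and 
labels from the other, and lists the records that have no counterpart.

```python
import sqlite3


def merge_stops(conn):
    """Join both sources on the shared key: coordinates from 'mine',
    labels from 'customer' (hypothetical table and column names)."""
    cur = conn.execute(
        """
        SELECT m.customer_ref, c.label, m.lat, m.lon
        FROM mine AS m
        JOIN customer AS c ON c.customer_ref = m.customer_ref
        ORDER BY m.customer_ref
        """
    )
    return cur.fetchall()


def unmatched(conn):
    """Customer rows without a counterpart in 'mine'; these need a human."""
    cur = conn.execute(
        """
        SELECT c.customer_ref, c.label
        FROM customer AS c
        LEFT JOIN mine AS m ON m.customer_ref = c.customer_ref
        WHERE m.customer_ref IS NULL
        ORDER BY c.customer_ref
        """
    )
    return cur.fetchall()


def demo():
    """Build a tiny in-memory example with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE mine (customer_ref TEXT PRIMARY KEY, lat REAL, lon REAL);
        CREATE TABLE customer (customer_ref TEXT PRIMARY KEY, label TEXT,
                               comment TEXT);
        INSERT INTO mine VALUES ('A1', 51.22, 4.40), ('A2', 51.25, 4.38);
        INSERT INTO customer VALUES ('A1', 'Stop one', ''),
                                    ('A3', 'Stop three', 'moved?');
        """
    )
    return merge_stops(conn), unmatched(conn)
```

The join itself is the easy part.  The unmatched rows, and whatever 
ended up in the comment field, are where the manual work goes.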

I just cannot turn my common sense about how to handle each record into 
an algorithm.

The reason why I'm against such an action is that I cannot trust 
either source of data to be consistent.  You might manage to import 
30%, even 60% without problems, but it will follow the 80/20 rule: 
20% of the effort and time will be spent importing 80% of the data, 
and you'll spend the other 80% of your time on the remaining 20% of 
the data.  You might also end up spending weeks using Overpass to 
correct OSM before the import.  I've done a lot of those corrections 
and I assure you, it's just crazy what kind of mistakes you find in 
OSM alone.  addr:postcode=Zemst, addr:city=1980 is the first one that 
pops into my mind.  But that is just one of many idiotic things: honest 
mistakes and ignorance at work, all well-meant efforts made with the 
best intentions.
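
A check for that particular mistake could look roughly like the sketch 
below.  It assumes the tags of an element are already available as a 
plain Python dict (for example after fetching them with Overpass), and 
that a Belgian postcode is four digits not starting with 0.  The 
function name looks_swapped is invented for this example.

```python
import re

# Assumption: Belgian postcodes are four digits, first digit 1-9.
BELGIAN_POSTCODE = re.compile(r"^[1-9]\d{3}$")


def looks_swapped(tags):
    """Return True when addr:city holds a postcode-like value while
    addr:postcode holds something that is not a postcode."""
    postcode = tags.get("addr:postcode", "").strip()
    city = tags.get("addr:city", "").strip()
    return (
        bool(BELGIAN_POSTCODE.match(city))
        and bool(postcode)
        and not BELGIAN_POSTCODE.match(postcode)
    )
```

This only flags candidates.  Whether the two values should really be 
swapped is still a decision for whoever reviews the element.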

So in the end, we'll need something like Ben proposed.  It's a lot more 
fun indeed and it gives ownership,  just perfect.

Glenn

On 2013-10-22 07:12, Marc Gemis wrote:
> Dutch below
>
> Allow me to explain why I'm against a full automatic import of the 
> Crab data, as proposed on this mailing list
>
> I understand that this is the fastest way to get the data into OSM and 
> ready for use by everybody.
> However, the data will then be owned by 1 or 2 people that did the 
> import. They will not be able to cope with the consequences of the 
> data they imported. The import software will have some flaws (double 
> addresses, missing buildings, bad buildings, problems with 
> associatedStreet merging, etc.)
> Will you clean up the mess that others made?
>
> If, on the other hand, you allow people to import their own chunks of 
> data (via the tool made by the French), a lot of people "own" the data. 
> Every contributor takes some pride in the data s/he added and will be 
> glad to make corrections to it. Even during the initial import 
> improvements to the imported & existing data will be made. The more 
> people that do this, the better.
>
> It's all about community building. Build a community around this 
> import. This community will do other things as well afterwards.
>
> You can hear the same message in all presentations on import at the 
> SOTM US and SOTM conferences. Please take a look at those videos.
>
>
> ----- Dutch version (translated) ---
> Allow me to explain why I am against a fully automated import of the 
> Crab data, as was proposed somewhere.
>
> I understand that some people want to get the data into OSM quickly so 
> that everybody can use it. The consequence is that the data is 
> created by 1 or 2 people. They cannot solve all the small problems 
> that arise from this import. I am thinking of bugs in the software 
> that cause duplicate addresses, or problems with the 
> associatedStreet relations. Nothing is done during the import about 
> missing or incorrect buildings either. Who is going to tackle the 
> problems that were created by others?
>
> If, on the other hand, you allow everyone to import small pieces of 
> data and improve them immediately, you get a group of people who 
> own/manage the data. These people will, in a sense, be proud of 
> their work and will try to get the errors out. The more of these 
> people, the better.
>
> So it is about building a community. Please build a community around 
> this import. In the longer term OSM will benefit from it.
>
> I believe I also hear this message in all the presentations about 
> imports given at the SOTM US and SOTM conferences. Have a look at 
> those videos. (They are in English.)
>
> regards
>
> m
