[Imports] Regione Veneto house numbers import

Davide Sandona' sandona.davide at gmail.com
Wed Nov 23 21:17:37 UTC 2016


Hello Frederik,
thank you for pointing out your concerns.

> Is this mechanical edit part of your planned import as well?
>

This semi-automated edit is not part of the import; it is a preparatory
edit that has to be done before the import, as explained in [1]. These
edits will be uploaded in a separate changeset.

> It says that "each entry of the dictionary is manually checked to correct
> any association issues" - who will perform this manual check, and how will
> it be done?
>

The manual check is performed by the users of the scripts. At the moment I
am alone, but any other members who join the effort will use them too!
The scripts create a dictionary and save it to a CSV file. The user opens
the file and checks each association manually. A score field is also
present: the lower the score, the higher the probability of a mismatch.
For each suspected mismatch, the user decides whether the association made
by the script is correct or needs to be changed, and edits the CSV file
accordingly. One example of this dictionary can be seen at [2] (this
dictionary has been manually corrected). Btw, if you look at the other
files in that repository, you will see a detailed log of what is going on
and how this can help to improve OSM. At the end of the process, the
community knows exactly which street names are missing from OSM: it is
then possible to go out and search for them. It's also possible to detect
edits made by trolls.
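
To make the idea of the scored dictionary concrete, here is a minimal
sketch in Python. It is not the actual script: the function names, the CSV
column names and the use of difflib as the similarity measure are all
assumptions made for illustration, and the street names are sample data.

```python
import csv
import sys
from difflib import SequenceMatcher


def best_match(osm_name, official_names):
    """Return (official_name, score) for the closest official name.

    The score is difflib's similarity ratio, between 0 and 1 (an assumed
    metric, not necessarily the one used by the real scripts).
    """
    scored = [
        (name, SequenceMatcher(None, osm_name.lower(), name.lower()).ratio())
        for name in official_names
    ]
    return max(scored, key=lambda pair: pair[1])


def build_dictionary(osm_names, official_names):
    """Associate each OSM name with one official name, lowest score first."""
    rows = [(osm, *best_match(osm, official_names)) for osm in osm_names]
    return sorted(rows, key=lambda row: row[2])


def write_csv(rows, handle):
    """Write the dictionary so a person can review and correct it by hand."""
    writer = csv.writer(handle)
    writer.writerow(["osm_name", "official_name", "score"])  # hypothetical columns
    for osm, official, score in rows:
        writer.writerow([osm, official, f"{score:.2f}"])


if __name__ == "__main__":
    osm = ["Via Roma", "Via U. Scarpelli"]          # sample OSM names
    official = ["Via Roma", "Via Uberto Scarpelli"]  # sample municipal names
    write_csv(build_dictionary(osm, official), sys.stdout)
```

Sorting by ascending score puts the most doubtful associations at the top
of the file, which is where the manual check should start.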

> If OSM says the street name is A and the script says it looks like street B
> from the municipal data, who will actually go out and look at the street
> sign? Or will you automatically assume that the municipal data is always
> right?
>

The following assumptions have been made:

   - Municipal data is correct. There is no reason to believe otherwise:
   this data has been used for years to administer our municipalities! If
   you have reason to believe that the data is not correct, it's your duty to
   prove it! I've done everything I can to verify the validity of the data:
   in my opinion the data is good and reliable!
   - OSM users are human: it is possible to make mistakes. In this specific
   case the mistakes take the form of typos and small inaccuracies (you
   can see some examples on the wiki page of the import).
   - OSM users mapped what they saw. If the street sign contains
   abbreviations, the OSM mapper entered them into OSM. If part of the name
   was not printed on the street sign, the mapper entered into OSM only what
   he saw on the sign. But an abbreviation is just a way to make the entire
   name fit on the street sign! There is no reason to use abbreviations in
   OSM. Each abbreviation is a possible source of conflict. The following are
   just a couple of examples:
      1. If you see a street sign reporting the name "Via U. Scarpelli",
      some OSM user tells you "Via Ugo Scarpelli", and the official
      municipality data tells you "Via Uberto Scarpelli", which one do you
      believe to be true? I believe the official sources, which provide a
      list of street names explaining why each street has its name!
      2. Consider a single municipality: given that each name is used for
      only one street, if you see a street sign reporting "Via
      Garibaldi", which of the following possibilities do you think is
      appropriate for your specific case: "Via Giuseppe Garibaldi", "Via Anita
      Garibaldi", or "Via Brigata Garibaldi"? Looking at that street sign, you
      will never know for sure. This is problematic because one municipality
      could use all three of those names. Think about an old street name
      carved into a stone on the corner of a building. At that time there was
      probably only one street in the entire city containing the word
      "Garibaldi". Do you think it's easy to change that name or that stone?
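
The ambiguity in the two examples above can be shown with a small Python
sketch. It is only an illustration under my own assumptions (the helper
names are hypothetical and this is not part of the import scripts): a sign
name is treated as compatible with an official name when each word on the
sign, or each initial such as "U.", appears in order in the official name.

```python
def word_matches(sign_word, official_word):
    """An initial like "u." matches any word starting with "u"."""
    if sign_word.endswith("."):
        return official_word.startswith(sign_word[:-1])
    return sign_word == official_word


def matches_sign(sign_name, official_name):
    """True if the sign's words appear, in order, in the official name."""
    remaining = iter(official_name.lower().split())
    # any() consumes the shared iterator, so word order is respected.
    return all(
        any(word_matches(word, candidate) for candidate in remaining)
        for word in sign_name.lower().split()
    )


def candidates(sign_name, official_names):
    """All official names that the text on the sign could stand for."""
    return [name for name in official_names if matches_sign(sign_name, name)]


official = [
    "Via Giuseppe Garibaldi",
    "Via Anita Garibaldi",
    "Via Brigata Garibaldi",
    "Via Uberto Scarpelli",
]
print(candidates("Via Garibaldi", official))     # three possible streets
print(candidates("Via U. Scarpelli", official))  # one possible street
```

When the list has more than one entry, the sign alone cannot tell you which
street it is; that is exactly the case where the official data is needed.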

The manual check will (virtually) eliminate all these conflicts: this is a
consequence of what I'm doing, i.e. the intersection of two different
datasets (the data from OSM and the data from the municipality). In the
rare event of a conflict, I'll have to write to the national talk list
looking for help!
The way you formulated your question assumes the street sign is always
correct and the OSM mapper always did a good job. As you have seen from the
above examples, that is not the case. In the past there were no laws
requiring Italian municipalities to print the full street names. Things
have changed: all Italian municipalities have to adhere to the ISTAT
guidelines regarding odonyms in order to reduce (virtually eliminate) the
sources of conflict. You can check the document linked on the wiki page of
this import for more information. This alone is enough to justify this
effort (even if I weren't going to do an import).
I see no reason to have such sources of conflict in OSM just because a
user mapped what he saw. This has been discussed on the national talk list,
and there was agreement.
Further, changing the name to the official one is going to save me (and all
the people who will be involved in the process) at least hundreds of QA
hours after the house numbers import has been completed.

> As for the execution of the import, I would strongly suggest that you look
> for local mappers living in the cities involved and enlist their help in
> uploading data in their region, doing a sanity check by comparing things to
> their knowledge or aerial imagery. Under no circumstances should you
> personally, or any other single person, upload data for all the cities -
> this will mean that the person cannot take the necessary time to
> quality-check the data they are planning to upload.
>

I completely agree with you, and I've already started looking for members
willing to help!

[1]
https://wiki.openstreetmap.org/wiki/Import/Catalogue/Veneto_House_Numbers_Import#Data_Preparation
[2]
https://github.com/Davide-sd/OSM/blob/master/Toponomastica/Vicenza/dict-high-topo.csv

Davide.

2016-11-23 19:53 GMT+01:00 Frederik Ramm <frederik at remote.org>:

> Dear Davide,
>
> On 11/23/2016 06:57 PM, Davide Sandona' wrote:
> > I will soon start a regional house numbers import.
>
> Your wiki page contains information about a mechanical edit that
> attempts to automatically align OSM street names with official data.
>
> Is this mechanical edit part of your planned import as well?
>
> It says that "each entry of the dictionary is manually checked to
> correct any association issues" - who will perform this manual check,
> and how will it be done? If OSM says the street name is A and the script
> says it looks like street B from the municipal data, who will actually
> go out and look at the street sign? Or will you automatically assume
> that the municipal data is always right?
>
> As for the execution of the import, I would strongly suggest that you
> look for local mappers living in the cities involved and enlist their
> help in uploading data in their region, doing a sanity check by
> comparing things to their knowledge or aerial imagery. Under no
> circumstances should you personally, or any other single person, upload
> data for all the cities - this will mean that the person cannot take the
> necessary time to quality-check the data they are planning to upload.
>
> Bye
> Frederik
>
> --
> Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"
>
> _______________________________________________
> Imports mailing list
> Imports at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/imports
>