[Imports] Belgium address import
Kurt Roeckx
kurt at roeckx.be
Sat Nov 23 15:26:50 UTC 2013
On Sat, Nov 23, 2013 at 03:37:34PM +0100, Martin Raifer wrote:
> Hi Ben!
>
> As far as I can see, this import only covers the Flemish Region
> (Brussels excluded), not the whole country. So, there aren't any
> multilingual municipalities (like Brussels), are there?
Yes, it only covers the Flemish region. Brussels isn't part of
it.
If you look at all the data we get, the database can have multiple
languages for street names (and city names) but we currently only
use the Dutch part from it. The region that is covered only has
1 official language (Dutch).
> Can you tell us more about where this data comes from? Is it
> government data?
It's provided by the Flemish government.
> It looks like as if a small fraction of the address nodes in your
> converted .osm files have wrong coordinates: There are a bunch of
> nodes somewhere North of Paris (49.2933352/2.3066896), for example
> Mechelsesteenweg 1,5,7 and 71 in
> http://addr.openstreetmap.fr/vlaanderen//100/93145.osm
The website still contains data generated by Ben. I see 430 nodes
in it that don't belong there. But Ben's data is known to have
various issues that are fixed in my file, and my file doesn't
have those nodes there.
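
Stray nodes like the ones near 49.2933352/2.3066896 could be caught with a
simple bounding-box check on the converted .osm files. The following is only
a sketch: the bounding box is a rough, assumed rectangle around the Flemish
Region (not taken from the import data), and `find_stray_nodes` is a
hypothetical helper, not part of the import tool.

```python
import xml.etree.ElementTree as ET

# Rough rectangle around the Flemish Region. These values are an assumption
# made for illustration, not official bounds from the source data.
FLANDERS_BBOX = (50.6, 2.5, 51.6, 6.0)  # (min_lat, min_lon, max_lat, max_lon)


def find_stray_nodes(osm_xml, bbox=FLANDERS_BBOX):
    """Return (id, lat, lon) for every node that lies outside bbox."""
    min_lat, min_lon, max_lat, max_lon = bbox
    stray = []
    for node in ET.fromstring(osm_xml).iter("node"):
        lat = float(node.get("lat"))
        lon = float(node.get("lon"))
        inside = min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
        if not inside:
            stray.append((node.get("id"), lat, lon))
    return stray


# Two made-up nodes: one at the coordinates reported above, one in Flanders.
sample = """<osm version="0.6">
  <node id="-1" lat="49.2933352" lon="2.3066896"/>
  <node id="-2" lat="51.0259" lon="4.4776"/>
</osm>"""

stray_nodes = find_stray_nodes(sample)
print(stray_nodes)
```

A rectangle is a coarse filter: it flags nodes that are far off, but it
will not catch a node that is merely in the wrong municipality.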
Serge Wroclawski wrote:
> 1. What QA have you done on the data to see that it's both good and consistent?
>
> Belgium is a large area and what we have found in the past is that
> data may be great in one place, but very poor in another. How have you
> checked your data?
The data we get actually says how good they think the data is.
But they also know that it contains errors and, as far as I
understand, are interested in feedback about errors we find in it.
They actually have multiple sources for the address information
and then have a table with what they think is the best
information and where they got it from. This includes
interpolation for addresses they have no other information for.
Looking at the data from that table, it seems to be correct most
of the time and as far as I have seen matches with the data that
is already in osm.
> 2. What will you be conflating these addresses to?
>
> Do you have building outlines for every building in Belgium?
>
> If not, perhaps it might make more sense to simply include the data as
> a secondary lookup source in a geocoder, such as is done with TIGER in
> Nominatim now?
The government has building outlines for everything in Flanders
but that is not available under a free license (yet).
In osm we do not have all buildings yet, or the buildings that
are drawn might actually be multiple buildings. The import
plan says that an outline should be drawn for each building and
the address information should be added to the building.
> 3. How will you track participants?
>
> How will you know who did what part of the import? Is that something
> your tool does?
The tool allows you to mark something as done, and you can fill in
a name but it's not required.
> 4. How will you perform post-import QA?
>
> The French building import, and the NYC building import have already
> shown that data quality differs greatly between contributor, even if
> the data quality of the upstream data is consistent and high quality.
> How do you plan on maintaining high quality amongst your participants?
I do plan to try and do QA work on it, comparing our data with
the government data. They also publish their data on a regular
basis, so we need to keep track of their changes.
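
One way such a comparison could work is a set difference over address keys,
run against each new government release. This is a minimal sketch under
assumptions: the dictionary fields `street` and `housenumber` and the helper
names are hypothetical, and the sample addresses are made up for
illustration within a single municipality.

```python
def address_key(addr):
    """Build a comparable key from one address record (hypothetical fields)."""
    return (addr["street"], addr["housenumber"])


def diff_addresses(ours, theirs):
    """Compare two address lists and report what each side lacks."""
    our_keys = {address_key(a) for a in ours}
    their_keys = {address_key(a) for a in theirs}
    return {
        "missing_in_ours": sorted(their_keys - our_keys),
        "not_in_theirs": sorted(our_keys - their_keys),
    }


# Made-up sample data, not taken from OSM or the government release.
osm_addresses = [
    {"street": "Mechelsesteenweg", "housenumber": "1"},
    {"street": "Mechelsesteenweg", "housenumber": "5"},
]
government_addresses = [
    {"street": "Mechelsesteenweg", "housenumber": "1"},
    {"street": "Mechelsesteenweg", "housenumber": "7"},
]

report = diff_addresses(osm_addresses, government_addresses)
print(report)
```

Running the same diff between two consecutive government releases would
show which addresses they added or removed in between.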
> 4. How will you maintain that users are obeying the rules regarding
> separate import accounts?
I'm not convinced that a separate account is really needed or
wanted. We're providing a data source but the user that uploads
the data is supposed to do more than just verify and upload it.
The user has to merge it with the building, draw buildings, maybe
fix various other things. All those other things have nothing
to do with the import of the address information.
I really view this as being the same as using Bing imagery for
which there is also no need to create a separate account.
Kurt