[OSM-talk] Fwd: Addresses are a tiny fraction of what we do (was: The world's best addressable map)

Christian Quest cquest at openstreetmap.fr
Fri Oct 24 19:38:12 UTC 2014


Addresses in France...

We started a project to collect addresses in a separate database called
"BANO" (Base d'Adresses Nationale Ouverte: Open National Address Database).

We've recreated the data from the national cadastre (scraping 1.3 million
PDF files), open data sources and... OSM.

This database contains 15+ million addresses so far, and we recently added
almost 4 million hamlet and locality names.
A full dump contains 19.7 million locations ranging from house numbers to
municipalities (no POIs).

Why did we do it that way?

Importing millions of addresses can be done quick and dirty in a couple of
days, but such a "blind" import does not really fit the import policy, and
we also learned from the TIGER import that fixing data is much less fun
than creating new data.

Why import all this if the data is already available (under ODbL)?

It seems much better to take the time required to import this data street
by street, reviewing it to make sure we improve its quality and not just
copy it. This will take years, many years (from 5 to 20) depending on how
deeply the data is reviewed before upload. Some contributors have started
this work, but it is really boring and I don't expect we can attract a
large number of contributors to that project.

Anyway, BANO updates its content every night and collects new OSM addresses
to replace other sources. So it also takes advantage of the address
reviewing/fixing done in OSM during this import process or during any
address-related contribution.
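
To make the "OSM replaces other sources" idea concrete, here is a minimal
Python sketch. It is only an illustration: the source labels, the priority
order and the record fields are assumptions made for the example, not
BANO's actual schema or code.

```python
# Hypothetical source ranking: the lowest number wins.
# These labels and this order are assumptions, not BANO's real rules.
SOURCE_PRIORITY = {"OSM": 0, "OPENDATA": 1, "CADASTRE": 2}


def merge_addresses(records):
    """Keep one record per (municipality, street, housenumber), preferring OSM."""
    best = {}
    for rec in records:
        # Street names are compared case-insensitively for this sketch.
        key = (rec["municipality"], rec["street"].lower(), rec["housenumber"])
        current = best.get(key)
        if (current is None
                or SOURCE_PRIORITY[rec["source"]] < SOURCE_PRIORITY[current["source"]]):
            best[key] = rec
    return list(best.values())


# Invented sample records, for illustration only.
records = [
    {"municipality": "Exampleville", "street": "Rue de la Gare",
     "housenumber": "12", "source": "CADASTRE"},
    {"municipality": "Exampleville", "street": "rue de la gare",
     "housenumber": "12", "source": "OSM"},
    {"municipality": "Exampleville", "street": "Rue du Moulin",
     "housenumber": "3", "source": "OPENDATA"},
]
merged = merge_addresses(records)
```

Run nightly, a merge like this would let an address reviewed in OSM take
over from the cadastre or open data record for the same house number.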

What is much more interesting is that OSM contributors can use BANO to
detect missing roads/streets and names (we have a BANO tiled overlay
showing missing names, for example here:
http://layers.openstreetmap.fr/?zoom=18&lat=48.8474&lon=3.23191&layers=B0000FFFFFFFFFFFFFFFFFFFFFT
).
This seems much more useful as we're far from having all roads and streets
mapped and named in France.
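
The kind of comparison behind such an overlay can be sketched in a few
lines of Python. This is an assumption about the general idea (a set
difference on normalized street names), not the code that produces the
overlay; the function name and sample names are invented.

```python
def missing_street_names(address_db_names, osm_names):
    """Return street names from the address database with no named road in OSM."""
    def normalize(name):
        # Lowercase and collapse repeated whitespace before comparing.
        return " ".join(name.lower().split())

    known = {normalize(n) for n in osm_names}
    return sorted(n for n in set(address_db_names) if normalize(n) not in known)


# Invented sample names for one municipality.
missing = missing_street_names(
    ["Rue de la Gare", "Rue du Moulin", "Impasse des Lilas"],
    ["rue de la  gare", "Rue du Moulin"],
)
```

A real comparison would also need to handle abbreviations and spelling
variants, which this sketch ignores.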

We can even see this "BANO effect" on some graphs:
http://osm2020.free.fr/qa-commune/popu-sans-route-name-france.png

Yes, something happened last May... BANO became available at that time,
and the population for which no nearby named road was present has
decreased almost twice as fast since then.

You can also see the missing names graph here:
http://munin.openstreetmap.fr/osm12.free.org/osm104.openstreetmap.fr/bano_rapproche.html
More than 100,000 names have been added since May.


To summarize... yes, addresses really are an important dataset, mainly
because they let us cross the boundary between non-geographic data (postal
addresses) and geographic data with the help of a (good) geocoding
algorithm.
This brings a lot of new data users to OSM by providing the data that
fuels services like routing from address A to address B. Some public
service websites have started using OSM + BANO that way.
It also makes it possible to geocode new (open) datasets to improve OSM
with more interesting data (we're about to do this for almost 30,000
pharmacies).
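
As a rough illustration of geocoding a postal address against an address
table, here is a minimal Python sketch. It uses naive exact matching on a
normalized string, which is far from a good geocoding algorithm; the field
names, the town and the coordinates are invented for the example.

```python
def normalize(text):
    """Lowercase, drop commas and collapse whitespace."""
    return " ".join(text.lower().replace(",", " ").split())


def build_index(addresses):
    """Index address records by a normalized 'housenumber street, city' string."""
    return {
        normalize(f"{a['housenumber']} {a['street']}, {a['city']}"): (a["lat"], a["lon"])
        for a in addresses
    }


def geocode(index, query):
    """Return (lat, lon) for an exactly matching address, or None."""
    return index.get(normalize(query))


# Invented record: the town and coordinates are placeholders, not real data.
index = build_index([
    {"housenumber": "12", "street": "Rue de la Gare", "city": "Exampleville",
     "lat": 48.5, "lon": 3.25},
])
hit = geocode(index, "12 rue de la gare, EXAMPLEVILLE")
miss = geocode(index, "99 Rue Inconnue, Exampleville")
```

Applied to an open dataset such as a list of pharmacies with postal
addresses, each successful lookup yields a coordinate that a mapper could
then review before adding anything to OSM.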

Is it mandatory to have these huge address datasets in OSM?
Maybe not, and not if the import process does not bring any improvement to
the data.
Mappers' time seems to me much better used for less mechanical
contributions.

-- 
Christian Quest - OpenStreetMap France