[OSM-dev] Bulk batch address search

Julien Cochennec trblft at gmail.com
Mon Jun 11 22:04:46 UTC 2018


Hi,
I work for a large statistics institute that has millions of addresses
stored in an Oracle database.
The data interacts with a SQL/Java search engine that is almost impossible
to port.
We can no longer afford to pay for this system, and we only have a few
months, perhaps a little more than a year, to switch to a different one.
Our software takes addresses in large files from external providers, adds
geocoding data and statistics to each address, and returns the extended
data to the providers as larger files.

We need to switch to PostgreSQL, so I was thinking about:
- converting our address data into OSM format
- turning our non-geographic data (confidential administrative data) into
tags attached to the geographic address data
- loading all of this into our own Nominatim instance, containing only
French addresses
- developing a web interface based on existing OSM tools
- developing scripts that evaluate the match between a provider's address
and the addresses in the Nominatim database
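
To make the first two points concrete, here is a minimal sketch of how one
address record might be written as an OSM XML node. The addr:* keys are the
usual OSM address tags; the function names, the sample values and the
"x_internal:ref" key are hypothetical and only stand in for our own fields.
Negative IDs are the usual convention for objects that do not exist in OSM
yet; a real import may need further attributes (version, timestamp) that
are left out here.

```python
import xml.etree.ElementTree as ET


def address_to_osm_node(node_id, lat, lon, housenumber, street,
                        postcode, city, extra_tags=None):
    """Build one <node> element carrying addr:* tags plus optional extras."""
    node = ET.Element("node", id=str(node_id), lat=str(lat), lon=str(lon))
    tags = {
        "addr:housenumber": housenumber,
        "addr:street": street,
        "addr:postcode": postcode,
        "addr:city": city,
    }
    # extra_tags holds the non-geographic data; the keys are our own choice
    tags.update(extra_tags or {})
    for key, value in tags.items():
        ET.SubElement(node, "tag", k=key, v=str(value))
    return node


def to_osm_xml(nodes):
    """Wrap the nodes in an <osm> root and return the document as a string."""
    root = ET.Element("osm", version="0.6")
    root.extend(nodes)
    return ET.tostring(root, encoding="unicode")


# Hypothetical record, for illustration only
node = address_to_osm_node(-1, 48.869, 2.331, "10", "Rue de la Paix",
                           "75002", "Paris", {"x_internal:ref": "A42"})
print(to_osm_xml([node]))
```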

So I need to know whether it is possible to run millions of searches as a
bulk process through Nominatim, from the command line, taking a big input
file (let's say CSV) and finishing in a few hours, less than a whole night,
while searching only French addresses. And how would I do that? I saw
things about GeoPy, but I don't want to slow the process down with a web
API; I want to run it from the terminal only.
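
As a rough sketch of the kind of batch script I have in mind: read the CSV
in chunks, look up each address with a pool of workers, and write the rows
back with coordinates added. Everything here is an assumption, not a tested
setup: the "address" column name, the base URL and port of the local
instance, and the function names are hypothetical. The lookup uses the
Nominatim search endpoint with the q, format, countrycodes and limit
parameters; the exact path may differ depending on how the instance is
installed. It still goes over HTTP, but only to localhost.

```python
import csv
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def nominatim_lookup(address, base_url="http://localhost:8080"):
    """Query a local Nominatim instance (base_url is an assumption)."""
    params = urllib.parse.urlencode(
        {"q": address, "format": "json", "countrycodes": "fr", "limit": 1}
    )
    with urllib.request.urlopen(f"{base_url}/search?{params}") as resp:
        hits = json.load(resp)
    return (hits[0]["lat"], hits[0]["lon"]) if hits else None


def geocode_file(infile, outfile, lookup, workers=8, chunk_size=10_000):
    """Copy a CSV, appending lat/lon columns filled in by `lookup`.

    `lookup` takes an address string and returns (lat, lon) or None.
    Returns the number of rows processed.
    """
    reader = csv.DictReader(infile)
    fields = list(reader.fieldnames) + ["lat", "lon"]
    writer = csv.DictWriter(outfile, fieldnames=fields)
    writer.writeheader()
    done = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # Read one chunk at a time so a 3-million-row file never
            # has to sit in memory all at once
            batch = list(islice(reader, chunk_size))
            if not batch:
                break
            # pool.map returns results in the same order as the input
            hits = pool.map(lookup, [row["address"] for row in batch])
            for row, hit in zip(batch, hits):
                row["lat"], row["lon"] = hit if hit else ("", "")
                writer.writerow(row)
            done += len(batch)
    return done


# Possible use from a terminal script (file names are hypothetical):
# with open("provider.csv", newline="") as src, \
#         open("geocoded.csv", "w", newline="") as dst:
#     geocode_file(src, dst, nominatim_lookup)
```

Passing the lookup in as a parameter means the same loop could be tried
with a direct database query instead of HTTP if that turns out to be faster.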

I guess there are fewer than 100 million addresses in our database, but
providers sometimes send 3 million addresses in a single file.
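
To put a number on that, here is a back-of-the-envelope calculation of the
lookup rate such a file would require. The 8-hour figure is only my
assumption for "a whole night", and the function names are made up for the
example.

```python
def required_rate(n_addresses, hours):
    """Lookups per second needed to finish n_addresses within `hours`."""
    return n_addresses / (hours * 3600)


def hours_needed(n_addresses, rate_per_second):
    """Hours needed to process n_addresses at a given sustained rate."""
    return n_addresses / rate_per_second / 3600


# 3 million addresses in an assumed 8-hour night
print(round(required_rate(3_000_000, 8), 1))   # 104.2 lookups per second
```

So a 3-million-address file needs a sustained rate of a bit over 100
lookups per second to finish within that window.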

It would be a win-win: we could become a significant contributor to OSM by
having all our data in OSM format, and we could use almost all the tools
OSM already provides. We already publish some information, such as city
administrative borders/shapes, through an open data program.

Thanks all for your help.