[Geocoding] how to estimate hardware needed for nominatim osm2pgsql

Paul Norman penorman at mac.com
Thu Mar 1 17:41:54 UTC 2018


On 3/1/2018 4:46 AM, Josip Rodin wrote:
> I observed an issue with Nominatim import performance that is described
> at https://github.com/openstreetmap/Nominatim/issues/954
>
> Long story short - on the same machine, ways processing is egregiously
> slower than node processing; when I import the nodes for all of Europe,
> the ways get processed at a rate of 30/s; when I import the nodes for
> just France, the ways get processed at 27700/s.
>
> How does one go about debugging that? The documentation doesn't help much.
>
> I could let it drag along once at 30/s, but I don't want the same situation
> to persist with the updates later, which could render the whole setup useless.
>
> There's reports online that avoiding OVH Ceph storage would help. Would it?
> There doesn't seem to be any obvious variation to its behavior that could
> be directly attributed to it.
>
> There's an implication in the hardware requirements that having more memory
> would be helpful. Would that do the trick? I'm seeing the same memory usage
> graph with both inputs.

It's slow during the osm2pgsql import stage, so general osm2pgsql
advice applies here. For a large import, you want more RAM. Ideally,
you should have enough cache to fit all the node positions in RAM. For
Europe, this is probably 20GB to 25GB of cache on a machine with 32GB
of RAM.
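
As a rough way to size that cache for other extracts, here is a minimal
sketch in Python. It assumes about 8 bytes per cached node position plus
a 10% overhead; both figures are assumptions for illustration, not
numbers from this thread, and the function names are made up. osm2pgsql
takes the cache size in MB through its -C/--cache option.

```python
import math

# Assumption: roughly 8 bytes per cached node position.
BYTES_PER_NODE = 8


def estimate_node_cache_mb(node_count, bytes_per_node=BYTES_PER_NODE, overhead=1.1):
    """Rough cache size in MB needed to hold every node position in RAM.

    `overhead` is a hypothetical safety margin, not a measured value.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return math.ceil(node_count * bytes_per_node * overhead / (1024 * 1024))


def fits_in_ram(cache_mb, ram_gb, reserve_fraction=0.25):
    """Check whether the cache fits while leaving a share of RAM
    (reserve_fraction, an assumed value) for PostgreSQL and the OS."""
    usable_mb = ram_gb * 1024 * (1 - reserve_fraction)
    return cache_mb <= usable_mb
```

Feed it the node count of your own extract; if fits_in_ram() comes back
False for the machine you have, the import will likely fall back to
looking node positions up on disk.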

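You cannot compare nodes processed per second to ways processed per
second and say one is faster than the other. They're measuring different
things: each way references several nodes, and processing it means
looking up the position of every one of them.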

If you're using some kind of cloud storage, IO latency is likely a big
issue. Cloud storage can typically only deliver good IOPS with a high
queue depth and/or many requests in parallel. Fortunately, if you're
using cloud storage, it's normally easy to get a machine with enough RAM
for the import, then switch to a smaller one for regular use. Keep in
mind that even in regular use, database workloads like Nominatim perform
best with plenty of RAM.
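
To see why latency matters more than headline IOPS, here is a small
sketch based on Little's law (throughput = requests in flight / latency).
The function names and the numbers you feed in are hypothetical, and it
assumes every lookup has to wait for storage, which is only true when
the node positions don't fit in the cache.

```python
def max_iops(latency_ms, queue_depth=1):
    """Upper bound on IO operations per second for a given per-request
    latency (in milliseconds) and number of requests in flight."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    if queue_depth < 1:
        raise ValueError("queue_depth must be at least 1")
    return queue_depth * 1000 / latency_ms


def seconds_for_lookups(total_lookups, latency_ms, queue_depth=1):
    """Time to finish a number of lookups if each one hits storage."""
    return total_lookups / max_iops(latency_ms, queue_depth)
```

With requests issued one at a time, a 1 ms round trip caps you at 1000
lookups per second no matter how many IOPS the storage advertises; the
advertised figure is only reachable with many requests in flight.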
