[OSRM-talk] osrm-extract taking hours to complete

Kieran Caplice kieran.caplice at temetra.com
Thu Mar 3 09:49:17 UTC 2016


I realise now I did in fact send my last email to the list, rather than 
to Patrick directly....no harm done! The info might be useful to others 
anyway.

Thanks Björn, that's very helpful. Our extract took just over 7 hours 
yesterday, which isn't as long as I thought it would take, so we'll 
probably just schedule it to run every weekend or so and move the files 
to the correct location when finished.
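
For anyone wanting to script the "move the files when finished" step, here is a minimal sketch rather than what we actually run. It assumes a POSIX filesystem and a layout where the live location is a symlink; the function name `publish`, the `releases` directory and the tag are all made up for illustration.

```python
import os
import shutil
from pathlib import Path


def publish(build_dir: Path, releases_dir: Path, current_link: Path, tag: str) -> Path:
    """Copy a finished build to releases_dir/tag and repoint current_link at it.

    Hypothetical helper: the directory layout and names are assumptions.
    """
    releases_dir.mkdir(parents=True, exist_ok=True)
    staging = releases_dir / f"{tag}.partial"
    target = releases_dir / tag

    # Copy into a staging directory first so a half-copied set is never live.
    if staging.exists():
        shutil.rmtree(staging)
    shutil.copytree(build_dir, staging)
    staging.rename(target)  # on POSIX this raises if a non-empty release with this tag exists

    # Build the new symlink beside the old one, then swap it in with one rename.
    tmp_link = current_link.with_name(current_link.name + ".new")
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    os.symlink(target.resolve(), tmp_link, target_is_directory=True)
    os.replace(tmp_link, current_link)
    return target
```

Older releases stay on disk under their tag, so rolling back is a matter of repointing the link.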

Kind regards,
Kieran Caplice

On 03/03/16 09:23, Björn Semm wrote:
> Hi Kieran,
>
> we run an OSRM update (planet) once a week on a central instance and copy the generated files to different environments.
>
> osrm@box:~$ ./osrm-update-planet-files.sh
>      Checking for md5sum [OK]
>      Checking for osrm-extract [OK]
>      Checking for osrm-prepare [OK]
>      Checking for tar [OK]
>      Checking for wget [OK]
>      Downloading planet-latest.osm.pbf.md5 ...  [OK]
>      Downloading http://planet.osm.org/pbf/planet-latest.osm.pbf ... [OK]
>      Verifying md5 checksum of planet-latest.osm.pbf ... [OK]
>      Starting osrm-extract at Wed Mar  2 11:57:41 CET 2016...
>      Finished osrm-extract at Thu Mar  3 00:21:34 CET 2016!
>      Starting osrm-prepare at Thu Mar  3 00:21:34 CET 2016...
>      Finished osrm-prepare at Thu Mar  3 09:21:23 CET 2016!
>      Removing old extracts from /data/current ... empty [OK]
>      Copying new generated files to /data/current ... [OK]
>      Renaming files in /data/current with Prefix 201609 ... [OK]
>      Creating md5 checksum over all 201609_planet-latest* ... [OK]
>      Compressing 201609_planet-latest* to 201609_planet-latest.tar.gz ... [OK]
>      Determining if test or prod env is the target ... TEST [OK]
>      Copying new generated files to /mnt/osrm-extract (TEST) ...  [OK]
>      Cleaning up /mnt/osrm-extract ... [OK]
>      Cleanup /data/planet-latest.osm.pbf ... [OK]
>
> On a VM with 96GB RAM, 4 cores and a RAID5 (HDD) it took about 12.5 hours to extract and 9 hours to prepare.
> SWAP is 100GB, stxxl=disk=/data/stxxl,250000,syscall
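
To put the stxxl setting above into the same units as the other sizes in this thread, here is a small sketch. It assumes the line follows the usual `disk=<path>,<capacity>,<access method>` layout and that a bare capacity is counted in MiB; both are worth checking against the STXXL documentation for the version in use. The helper name `parse_stxxl_disk` is made up.

```python
def parse_stxxl_disk(line: str, unit_bytes: int = 1024 ** 2) -> dict:
    """Split a 'disk=path,capacity,method' line into its three fields.

    unit_bytes is the assumed size of one capacity unit (MiB by default).
    """
    key, _, value = line.strip().partition("=")
    if key != "disk":
        raise ValueError(f"expected a line starting with 'disk=', got {line!r}")
    path, capacity, method = value.split(",")
    return {"path": path, "bytes": int(capacity) * unit_bytes, "method": method}


cfg = parse_stxxl_disk("disk=/data/stxxl,250000,syscall")
print(f"{cfg['path']}: {cfg['bytes'] / 1e9:.1f} GB via {cfg['method']}")
```

Under that assumption the disk file is roughly 262 GB, comfortably above the ~200GB Patrick estimates further down.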
>
> We currently use Version 4.9.0 of osrm-backend.
>
> BR
> Björn
>
> ________________________________________
> From: Kieran Caplice <kieran.caplice at temetra.com>
> Sent: Wednesday, 2 March 2016 18:23
> To: osrm-talk at openstreetmap.org
> Subject: Re: [OSRM-talk] osrm-extract taking hours to complete
>
> Hi Patrick,
>
> That makes sense then. It's obvious the process is just going to take
> upwards of 8-10 hours for us in that case.
>
> Thanks for the help.
>
> Kind regards,
> Kieran Caplice
>
> On 02/03/16 17:01, Patrick Niklaus wrote:
>> Hey Kieran,
>>
>>
>> there have been a lot of structural changes (e.g. moving code from
>> osrm-prepare into osrm-extract) that probably invalidate those numbers.
>> Also we support 64bit OSM ids now, which sadly uses a lot more disk
>> space. I think stxxl needs about 200GB. I think on our setup we have a
>> turn-around of 6 hours for the planet dataset on an SSD setup (car
>> profile, any other profile needs significantly longer). You should
>> probably think about upgrading your hard drives as this is IO bound. At
>> your current read/write speed it will already take more than an hour
>> to just write 200GB of data once. We scan it at least twice just for
>> pre-processing.
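
The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 1000 MB), a constant sustained rate, and uses the 35-40 MB/s and 200GB figures quoted in this thread. The function name `transfer_minutes` is made up.

```python
def transfer_minutes(size_gb: float, rate_mb_s: float, passes: int = 1) -> float:
    """Minutes needed to move size_gb of data `passes` times at rate_mb_s."""
    seconds = passes * (size_gb * 1000.0) / rate_mb_s
    return seconds / 60.0


for rate in (35, 40):
    once = transfer_minutes(200, rate)
    twice = transfer_minutes(200, rate, passes=2)
    print(f"{rate} MB/s: one pass {once:.1f} min, two passes {twice:.1f} min")
```

At those rates a single pass over 200GB comes out at roughly 83 to 95 minutes, and two passes at roughly 2.8 to 3.2 hours, before any parsing or sorting work is counted.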
>>
>> Cheers,
>> Patrick
>>
>>
>> On Wed, Mar 2, 2016 at 5:51 PM, Kieran Caplice
>> <kieran.caplice at temetra.com> wrote:
>>> Hello,
>>>
>>> I'm currently extracting the planet PBF (~31 GB), and it's been running for
>>> hours. I notice in the "Running OSRM" wiki page, it says "On a Core i7 with
>>> 8GB RAM and (slow) 5400 RPM Samsung SATA hard disks it took about 65 minutes
>>> to do so from a PBF formatted planet", which is making me wonder why it's
>>> taking so long on our server. Below are some example output messages:
>>>
>>> [info] Parsing finished after 3584.35 seconds
>>> [extractor] Erasing duplicate nodes   ... ok, after 319.091s
>>> [extractor] Sorting all nodes   ... ok, after 3632.87s
>>> [extractor] Building node id map      ... ok, after 2025.29s
>>> [extractor] Confirming/Writing used nodes     ... ok, after 1096.24s
>>> [extractor] Sorting edges by start    ... ok, after 2000.08s
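
To see how much wall-clock time the stages quoted above account for, the durations can be pulled out and summed. A rough sketch: the regular expression only assumes that each line reports its time as "after <number>s" or "after <number> seconds", as in the six lines quoted; other log formats would need a different pattern.

```python
import re

LOG = """\
[info] Parsing finished after 3584.35 seconds
[extractor] Erasing duplicate nodes   ... ok, after 319.091s
[extractor] Sorting all nodes   ... ok, after 3632.87s
[extractor] Building node id map      ... ok, after 2025.29s
[extractor] Confirming/Writing used nodes     ... ok, after 1096.24s
[extractor] Sorting edges by start    ... ok, after 2000.08s
"""

# Matches both "after 3584.35 seconds" and "after 319.091s".
DURATION = re.compile(r"after\s+(\d+(?:\.\d+)?)\s*s")


def stage_seconds(log: str) -> list[float]:
    """Return the duration in seconds of every log line that reports one."""
    return [float(m.group(1)) for m in DURATION.finditer(log)]


durations = stage_seconds(LOG)
total = sum(durations)
print(f"{len(durations)} stages, {total:.0f} s = {total / 3600:.2f} h")
```

The six quoted stages add up to about 3.5 hours, and they are only the part of the run that had been logged at that point.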
>>>
>>> Some stxxl errors were output because I set the disk size to 100GB, thinking
>>> it was enough - but I didn't think it would cause slowdowns like this,
>>> considering that extracting the Europe PBF also takes hours, without any
>>> stxxl errors.
>>>
>>> Server specs:
>>> Ubuntu 14.04
>>> Intel Xeon CPU E5-1650 v3 @ 3.50GHz  (hex-core with HT)
>>> 64 GB RAM @ 2133 MHz
>>> 2 TB Western Digital Enterprise 7200 RPM hard drive
>>>
>>> At the moment, disk IO is averaging around 35-40 MB/s R/W (~90%).
>>>
>>> Anyone have any ideas as to what might be going on? Or is it normal to take
>>> this long without an SSD?
>>>
>>> Thanks in advance.
>>>
>>> Kind regards,
>>> Kieran Caplice
>>>
>>>
>>> _______________________________________________
>>> OSRM-talk mailing list
>>> OSRM-talk at openstreetmap.org
>>> https://lists.openstreetmap.org/listinfo/osrm-talk
>>>
>



