[OSM-dev] Extracting Great Britain and Ireland from planet.osm

Keith Sharp kms at passback.co.uk
Tue Mar 13 11:02:23 GMT 2007

On Tue, 2007-03-13 at 09:36 +0000, Keith Sharp wrote:
> Hello,
> Following on from last week's discussion about reducing the size of the
> planet.osm data set for particular use cases, I have modified a script
> written by Frederik Ramm that extracted data for Germany so that it
> extracts data for Great Britain, Ireland, the Channel Islands, the Isles
> of Scilly, St Kilda, Orkney, and Shetland.  The polygon I used can be
> seen at:
> 	http://www.passback.co.uk/maps/gbirl.html
> You can download the script from:
> 	http://www.passback.co.uk/maps/extract-gb-irl.pl
> You run the script as follows:
> 	./extract-gb-irl.pl < planet-070307.osm > gb-irl.osm
> On my system a run took just over 5 minutes wall clock time.  I have not
> looked at memory usage in any detail, but it looked about 50MB max
> resident size in top (not the best measure, I know).  The uncompressed
> file is reduced from 3.40GB to 0.22GB; the bzip2 compressed size is 16MB.

WARNING - there was a bug!  Frederik pointed out in a private email that
the script was reducing the number of points in the polygon - a feature
necessary for his 5000 point Germany polygon, but not necessary for the
10 point GB polygon.  This led to less data being extracted than intended.
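
To illustrate why dropping points from a small polygon loses data, here
is a minimal Python sketch.  It is not taken from the Perl script: the
ray-casting test, the function name point_in_polygon, and the
coordinates are all assumptions made for illustration, and the real
script may test points differently.

```python
def point_in_polygon(lon, lat, polygon):
    """Even-odd ray casting: True if (lon, lat) lies inside polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical coordinates, NOT the real GB/Ireland polygon.
full = [(0, 0), (4, 0), (4, 4), (2, 6), (0, 4)]
# The same polygon after a point-reduction step dropped the vertex (2, 6).
reduced = [(0, 0), (4, 0), (4, 4), (0, 4)]

node = (2.0, 5.0)
in_full = point_in_polygon(node[0], node[1], full)
in_reduced = point_in_polygon(node[0], node[1], reduced)
print(in_full, in_reduced)  # prints: True False
```

With only a handful of vertices, each one bounds a large area, so
removing even one can exclude every node in that area from the extract.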

I've uploaded a fixed version, but the numbers look a little bigger:

	- 6 minutes 10 seconds to extract
	- 0.54GB uncompressed
	- 42MB bzip2 compressed
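
For a rough sense of scale, the figures quoted in this post can be
compared directly.  The short Python calculation below assumes decimal
units (1 GB = 1000 MB) and that the fixed script was run against the
same 3.40GB planet file; the variable names are illustrative only.

```python
# Figures quoted in this post (decimal units assumed: 1 GB = 1000 MB).
planet_gb = 3.40       # uncompressed planet dump
extract_gb = 0.54      # uncompressed extract from the fixed script
extract_bz2_mb = 42    # bzip2-compressed extract

# Share of the planet file that ends up in the extract.
extract_share = extract_gb / planet_gb
# Size of the compressed extract relative to the uncompressed extract.
bz2_ratio = extract_bz2_mb / (extract_gb * 1000)

print(f"extract is {extract_share:.1%} of the planet file")    # 15.9%
print(f"bzip2 output is {bz2_ratio:.1%} of the extract size")  # 7.8%
```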

I think it would be great if this could be run to post-process the
planet dump each week - it would probably save download bandwidth at the
cost of some extra disk space on the server.


