[OSM-dev] XML planets (was Re: New proposed directory layout for planet.openstreetmap.org)

Roland Olbricht roland.olbricht at gmx.de
Fri Sep 7 08:28:23 BST 2012


Hi,
 
> Overpass is an amazing resource, but I can't believe it relies on a
> XML dump of the database being released every two weeks?  How does
> that work?

The planet file is only needed for the first startup. Afterwards, an instance can run indefinitely on minute diffs alone. New instances can also be cloned from an existing instance over a public interface, without a planet file.

This initial startup usually has the following timing:

2 hours to download planet.osm.bz2

12-24 hours to import planet.osm.bz2 into the database. The import contains various optimizations that are fine-tuned to the structure of the XML planet, and it may get slower with a differently organised file format.

4-30 hours to catch up by applying the minute diffs. This depends on how many days old the planet file is (always at least two days: one for the planet file itself, a second for the import procedure above).

Altogether, there is not much point in "just improving the download time", because the download is only a small share of the total time needed. It would be far more important to make the import step fast again.
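To make the arithmetic concrete, here is a rough sketch that adds up the three stage estimates given above and computes the share of the download. The stage names and helper functions are only illustrative; the hour figures are the estimates from this mail, not measurements.

```python
# Stage durations in hours as (fastest, slowest), taken from the estimates above.
# The dictionary keys are illustrative names, not part of any Overpass tooling.
STAGES = {
    "download": (2, 2),
    "import": (12, 24),
    "catch_up": (4, 30),
}


def total_hours(stages):
    """Return (fastest, slowest) total startup time in hours."""
    fastest = sum(low for low, _high in stages.values())
    slowest = sum(high for _low, high in stages.values())
    return fastest, slowest


def download_share(stages):
    """Return (smallest, largest) fraction of the total spent downloading."""
    fastest, slowest = total_hours(stages)
    low, high = stages["download"]
    # The share is largest when the other stages run as fast as possible.
    return low / slowest, high / fastest


fastest, slowest = total_hours(STAGES)
smallest, largest = download_share(STAGES)
print(f"total startup: {fastest}-{slowest} hours")
print(f"download share: {smallest:.1%} to {largest:.1%}")
```

Under these estimates the whole startup takes 18 to 56 hours, and the download accounts for roughly 4% to 11% of that, so even a much faster download changes the total only slightly.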

I doubt that converting the PBF to XML can be done in the half hour saved on the download, so the simplest solution, converting after download, would surely be a slowdown.

> Yes, about 25 GB. Every two weeks.

OK, we are getting closer to the problem. I agree that probably very few people need old planet files. I think the following would be useful to keep:

- the last CC-BY-SA Planet
- a CC-BY-SA full history planet up to that date

- an ODbL full history planet from time to time (let's say four times a year or so)
- the latest two planet files

Altogether these are less than 300 GB of compressed XML, so I assume this is manageable unless the server administrators say otherwise. In that case, I would rather offer to host the above-mentioned files than see them dropped.

Best,

Roland


