[OSM-dev] Daily Planet.osm
nick at nickhill.co.uk
Tue May 1 18:20:33 BST 2007
I completely agree that we need instant access. 24 hours is not good enough.
If tiles are edited through the API, the changes should be reflected in the
slippy map within minutes (or tens of minutes at busy times). This is not
impossible.
I presented a tile invalidation scheme to the list several months ago. This was
widely accepted as a good solution.
Using an algorithm like the one I used to sort the database, each API update
can be mapped to a 32-bit tile number. The tile is marked dirty in a special
table and a timestamp is set. A simple scheduling system works through the
dirty tiles, re-rendering them in sequence. Given that the new API will have
sessions, the tile table need only be updated when a session is closed. Database
and CPU load for the scheme would be very small.
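
As a rough sketch of the scheme above (not the actual implementation): the
32-bit tile number is assumed here to come from quantising longitude and
latitude to 16 bits each and interleaving the bits, and "in sequence" is
assumed to mean oldest change first. The names tile_id, DirtyTiles,
close_session and render_pending are made up for illustration, and a dict
stands in for the database table.

```python
import time


def tile_id(lat, lon):
    """Hypothetical mapping of a position to a 32-bit tile number.

    Assumption: 16 bits per axis, bits interleaved (x in the odd bit positions).
    """
    x = min(int((lon + 180.0) / 360.0 * 65536), 65535)
    y = min(int((lat + 90.0) / 180.0 * 65536), 65535)
    tile = 0
    for bit in range(16):
        tile |= ((x >> bit) & 1) << (2 * bit + 1)
        tile |= ((y >> bit) & 1) << (2 * bit)
    return tile


class DirtyTiles:
    """In-memory stand-in for the special table of dirty tiles."""

    def __init__(self):
        self._dirty = {}  # tile number -> timestamp of first unrendered change

    def close_session(self, touched_points, now=None):
        """Mark the tiles touched during an API session, once, at session close."""
        now = time.time() if now is None else now
        for lat, lon in touched_points:
            # Keep the earliest timestamp if the tile is already dirty.
            self._dirty.setdefault(tile_id(lat, lon), now)

    def render_pending(self, render):
        """Re-render dirty tiles, oldest change first, clearing each one."""
        done = []
        for tile, _stamp in sorted(self._dirty.items(), key=lambda kv: kv[1]):
            render(tile)
            del self._dirty[tile]
            done.append(tile)
        return done
```

Because tiles are only marked when a session closes, a session with many edits
in the same area costs one table write per tile rather than one per edit.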
As an improvement to the above scheme, a per-tile counter in the tile database
can be incremented each time the tile is viewed, and used to increase the
re-rendering priority of the busiest or most actively viewed tiles.
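
A minimal sketch of that prioritisation, assuming a simple view counter per
tile and picking the most-viewed dirty tiles first. RenderQueue and its
methods are hypothetical names; a real version would keep the counters in the
tile table.

```python
import heapq


class RenderQueue:
    """Dirty tiles ordered by how often each tile has been viewed."""

    def __init__(self):
        self.views = {}     # tile number -> view count
        self.dirty = set()  # tile numbers awaiting re-rendering

    def record_view(self, tile):
        self.views[tile] = self.views.get(tile, 0) + 1

    def mark_dirty(self, tile):
        self.dirty.add(tile)

    def next_batch(self, n):
        """Return up to n dirty tiles, most viewed first, and clear them."""
        batch = heapq.nlargest(n, self.dirty,
                               key=lambda t: self.views.get(t, 0))
        self.dirty.difference_update(batch)
        return batch
```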
Such a tile invalidation scheme can easily form the basis of a real-time update
feed which does not impact the database.
This is achieved by copying the data used for each tile about to be re-rendered
to a file in a directory. The file represents the tile. Any request for the
current update returns a .tar.gz containing all the .osm tile files produced
since the last planet.
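
The file-per-tile feed could look something like the sketch below. It assumes
the update directory is emptied whenever a new planet is published, and that
each file is named after its tile number; snapshot_tile and build_update are
illustrative names only.

```python
import tarfile
from pathlib import Path


def snapshot_tile(update_dir, tile, osm_xml):
    """Write the data used to render `tile` to <update_dir>/<tile>.osm."""
    update_dir = Path(update_dir)
    update_dir.mkdir(parents=True, exist_ok=True)
    path = update_dir / f"{tile}.osm"
    # A later re-render of the same tile overwrites the earlier file.
    path.write_text(osm_xml, encoding="utf-8")
    return path


def build_update(update_dir, archive_path):
    """Pack every .osm tile file in the directory into one .tar.gz."""
    files = sorted(Path(update_dir).glob("*.osm"))
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            tar.add(f, arcname=f.name)
    return [f.name for f in files]
```

Serving the archive touches only the filesystem, which is why the feed need
not impact the database.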
We'd therefore have a perfectly up-to-date planet file (after patching)
available at all times, and an up-to-date slippy map.
Steve is very kindly re-implementing the API on Rails, which should open the
door for more-widespread API hacking and a faster development loop for OSM. That
was really the raison d'etre of dev, but it hasn't led to the widespread API
hacking originally envisaged.
Frederik Ramm wrote:
> I am a fervent advocate of instant access for every imaginable type
> of request. Give me all data for Australia, as it is NOW, not as it was
> an hour, a day, or a week ago! Databases are for direct access; anything
> else must surely be a compromise.
> That said, I fully understand that we have to make many compromises at
> the moment (but that should not cloud the vision!).
> Nick Hill asked:
>> Where is a daily dump useful, where a weekly dump is not?
> Mostly for map rendering, which is one of our main "shopping windows" to
> the outside world. Every mapnik layer renders off a planet dump. A daily
> dump is therefore useful for the following six things:
> * less than 24h old maps on Thursday
> * less than 24h old maps on Friday
> * less than 24h old maps on Saturday
> * less than 24h old maps on Sunday
> * less than 24h old maps on Monday
> * less than 24h old maps on Tuesday
> The tiles@home rendering layer would also be affected. As most of you
> know, even though we have the RSS feed and the ability to manually
> request rendering of tiles, a lot of changes are missed. Most Wednesdays
> I take the planet file and the full t@h tile statistics file (issued
> daily), and compare last modified times for the area of each tile, to
> find out which need re-rendering. At times, I have had to re-request as
> many as 10,000 tiles to bump the t@h layer up to a reasonably current
> view - which led to a big workload for all parties involved (database
> server, t@h tile server, and clients) on Wednesdays and Thursdays, that
> would much better be evenly spaced.
>> And is a daily dump
>> likely to lead to an overall benefit to OSM?
> If you count our image as viewed from the outside, yes, current maps are
> important. If I describe the project to others, it is obvious that the
> main fascination there is that "this is a map you can edit" - and people
> expect the "edit" to bear visible fruit. It is bad enough that such
> fruit needs half an hour to ripen with the tiles@home layer, but telling
> people that the edit I just made will show up on the main map "sometime
> after next Wednesday" really bursts a lot of soap bubbles. If I could
> say "tomorrow", that would still not be perfect, but make a difference.
> Every little helps ;-)
> --Frederik Ramm ## eMail frederik at remote.org ## N49°00.09' E008°23.33'