[Rebuild] Strategy for running the bot in "regions"

Andy Allan gravitystorm at gmail.com
Mon Jun 25 11:13:06 BST 2012


Hi All,

For those of you who aren't following the github commits closely,
we've now got a version of the bot that can run against a rails_port
instance and calculate and apply both the changeset (to edit or delete
data as appropriate) and the redactions (to historical versions).

This version works on a "whole database" scale, i.e. it examines all
the data in the database in one go. While this is fine for initial
testing, it's obviously not suitable for handling the production
database.

There's been previous agreement on the list that we want to run the
bot "per region", rather than e.g. by id range (first thousand nodes,
second thousand, etc.). Has anyone had any thoughts on how to select
all the data in a region? And on how to process region by region while
making sure we (eventually) process everything? Map calls are one
approach, but they raise complications with nested relations and
general resource usage (it takes a lot of calls to cover Ireland, for
example). Alternatively, the bot could read from the database
directly and build its own idea of what data is in a region.
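
To make the map-call option a bit more concrete, here is a rough
sketch in Python (purely for illustration, not the bot's own code).
It splits a region's bounding box into tiles small enough for one map
call each, and keeps a set of already-handled elements so that
anything returned by several overlapping calls is only processed once.
Everything named here (BBox, tile_region, process_region, and the
fetch_elements and process callables) is hypothetical, and the tile
size is an assumed parameter rather than a known API limit. It does
not attempt to solve the nested relations problem.

```python
import math
from typing import Callable, Iterable, Iterator, NamedTuple, Set, Tuple

# An element is identified by its type and id, e.g. ("way", 1234).
Element = Tuple[str, int]


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def tile_region(region: BBox, step: float) -> Iterator[BBox]:
    """Yield tiles of at most step x step degrees that cover the region."""
    cols = math.ceil((region.max_lon - region.min_lon) / step)
    rows = math.ceil((region.max_lat - region.min_lat) / step)
    for r in range(rows):
        for c in range(cols):
            lon = region.min_lon + c * step
            lat = region.min_lat + r * step
            # Clamp the last row/column so tiles never leave the region.
            yield BBox(lon, lat,
                       min(lon + step, region.max_lon),
                       min(lat + step, region.max_lat))


def process_region(tiles: Iterable[BBox],
                   fetch_elements: Callable[[BBox], Iterable[Element]],
                   process: Callable[[Element], None],
                   done: Set[Element]) -> None:
    """Process each element returned for the tiles, skipping repeats.

    fetch_elements is an assumed stand-in for "whatever returns the
    elements in a bbox" (a map call, or a direct database query).
    done is shared across regions so nothing is processed twice.
    """
    for tile in tiles:
        for element in fetch_elements(tile):
            if element in done:
                continue
            process(element)
            done.add(element)
```

With a very rough box around Ireland, BBox(-10.0, 51.0, -6.0, 55.0),
and an assumed step of 0.5 degrees, tile_region yields 64 tiles, i.e.
64 map calls for that one region. Elements that no tile ever returns
would presumably still need a final sweep of some kind to guarantee
full coverage.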

If anyone has pondered this before, or wants to do so now, I'd love to
hear about it (either here or on #osm-dev).

Cheers,
Andy


