[OSM-dev] 090916 Missing Nodes
brett at bretth.com
Fri Sep 18 03:49:56 BST 2009
Mark Granger wrote:
> As you can see, 24490963 should be between 24490962 and 24490964 but
> has gone AWOL. Since the same node is missing in two different
> versions of the planet file, this indicates that it is getting lost
> during export.
> Remember that these planet files are from the hypercube server which
> merges the daily updates with the weekly planet file from the main
> server. The problem could be with the program that merges the updates
> or it could be in the code that exports the weekly planet file.
It is possible that the daily files are missing data. They're running
with a 35 minute delay, so if a single changeset upload takes longer
than that it might not be included in the daily diff. The minute
diffs, running at a 5 minute delay, are frequently losing data.
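To illustrate the failure mode described above, here is a minimal sketch of a timestamp-window extraction. It is not the actual export code; the names Upload, extract_diff, quick and slow are hypothetical, and it assumes a row carries a timestamp set before its transaction commits and only becomes visible to the extractor at commit time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Upload:
    node_id: int          # illustrative id, not a real OSM node
    timestamp: datetime   # timestamp stored on the row
    committed: datetime   # when the transaction became visible


def extract_diff(uploads, window_start, window_end, delay):
    """Return the rows a timestamp-based diff would see.

    The extraction runs at window_end + delay and can only see rows
    whose transaction has committed by then.
    """
    run_at = window_end + delay
    return [
        u for u in uploads
        if window_start <= u.timestamp < window_end and u.committed <= run_at
    ]


day_start = datetime(2009, 9, 16)
day_end = datetime(2009, 9, 17)

# Committed five minutes after its timestamp.
quick = Upload(1, datetime(2009, 9, 16, 23, 50), datetime(2009, 9, 16, 23, 55))
# Timestamped inside the day, but the upload commits 42 minutes later.
slow = Upload(2, datetime(2009, 9, 16, 23, 58), datetime(2009, 9, 17, 0, 40))

short_delay = extract_diff([quick, slow], day_start, day_end, timedelta(minutes=35))
long_delay = extract_diff([quick, slow], day_start, day_end, timedelta(hours=12))
next_day = extract_diff([quick, slow], day_end, day_end + timedelta(days=1),
                        timedelta(minutes=35))
```

With the 35 minute delay the slow upload is left out of its own day's diff, and the next day's diff does not pick it up either because its timestamp falls outside that window. A longer delay only narrows the gap; it does not remove it.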
The newer replication diffs should fix this, but they're not quite ready
for general consumption yet.
The problem with the newer diffs is that they are not timestamp aligned
so it is not easy to tell exactly what data they include, and they
cannot be re-generated if problems occur.
I'm thinking that a combined approach of old style diffs and new style
diffs may be required. Leave the existing daily diffs running with a
much longer delay (e.g. 12 hours) to ensure they don't miss data.
Produce new style transaction id based diffs running every minute.
Create hourly diffs by rolling up the transaction id diffs. Something
like that anyway. Keeping a local db up to date will involve
downloading and importing a planet, then patching it with daily diffs to
the current day, then patching it with new style hourly/minute diffs to
keep up to date. Switching from timestamp aligned diffs to transaction
id diffs is imprecise, so an overlap period of several hours would be
required.
New osmosis tasks will be available to simplify some of this processing.
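The roll-up and overlap ideas above can be sketched roughly as follows. This is not how osmosis implements it; Change, roll_up and apply_diff are hypothetical names, a diff is modelled as a plain list of changes, and the local db as a dict. The assumption is that every element carries a version number, so a change that is already present locally can be skipped, which makes an overlap period harmless.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Change(NamedTuple):
    kind: str      # "node", "way" or "relation"
    id: int
    version: int
    action: str    # "create", "modify" or "delete"


Key = Tuple[str, int]


def roll_up(diffs: Iterable[List[Change]]) -> List[Change]:
    """Combine several small diffs into one, keeping only the highest
    version seen for each element."""
    latest: Dict[Key, Change] = {}
    for diff in diffs:
        for change in diff:
            key = (change.kind, change.id)
            if key not in latest or change.version > latest[key].version:
                latest[key] = change
    return sorted(latest.values(), key=lambda c: (c.kind, c.id))


def apply_diff(db: Dict[Key, Change], diff: Iterable[Change]) -> int:
    """Apply a diff to a local db and return how many changes were used.

    Changes that are not newer than the local copy are skipped, so
    re-applying overlapping diffs does no damage.
    """
    applied = 0
    for change in diff:
        key = (change.kind, change.id)
        current = db.get(key)
        if current is not None and current.version >= change.version:
            continue  # already have this version or a newer one
        db[key] = change  # deletes are kept as tombstones
        applied += 1
    return applied


minute1 = [Change("node", 1, 1, "create"), Change("node", 2, 1, "create")]
minute2 = [Change("node", 1, 2, "modify")]
hourly = roll_up([minute1, minute2])

db: Dict[Key, Change] = {}
first_pass = apply_diff(db, hourly)     # brings the db up to date
second_pass = apply_diff(db, minute1)   # overlapping data, all skipped
```

Keeping deleted elements as tombstones is a design choice of this sketch: without them, an older modify arriving in the overlap period would bring a deleted element back.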