[OSM-dev] 090916 Missing Nodes
Mark Granger
grangerfx at gmail.com
Fri Sep 18 04:02:21 BST 2009
> It is possible that daily files are missing data. They're running with a
> 35 minute delay hour, so if a single changeset upload takes longer than
> that time it might not be included in the daily diff. The minute diffs
> running at a 5 minute delay are frequently losing data.
So how could the planet files generated on two different days have a mostly
identical set of missing nodes?
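To make the comparison concrete, here is a minimal sketch of how two days' missing-node lists could be compared. The function name and the sample ids (apart from 24490963) are hypothetical, and the lists are assumed to have already been extracted from each planet file.

```python
def compare_missing(day_a, day_b):
    """Compare two lists of missing node ids from different planet files."""
    a, b = set(day_a), set(day_b)
    both = a & b
    union = a | b
    return {
        "both": sorted(both),        # missing on both days
        "only_a": sorted(a - b),     # missing only in the first file
        "only_b": sorted(b - a),     # missing only in the second file
        # Share of all missing ids that are missing on both days.
        "overlap": len(both) / len(union) if union else 1.0,
    }


# Hypothetical sample data; only 24490963 comes from this thread.
result = compare_missing([24490963, 100, 200], [24490963, 100, 300])
print(result["both"], result["overlap"])
```

A high overlap would suggest the same nodes are lost on every export, rather than different nodes being lost to a timing window each day.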
I hope that the newer diffs do solve this problem because currently it is
impossible to get a complete snapshot of the database without at least some
missing nodes. I hate to delete entire ways just because they are missing a
few nodes when I import the planet data.
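For illustration, the check for ways that reference absent nodes could be sketched roughly as below. This is not the importer actually used here. It assumes standard OSM XML with nodes listed before ways, the function name is hypothetical, and holding every node id in memory is only practical for extracts, not a full planet.

```python
import io
import xml.etree.ElementTree as ET


def find_missing_nodes(source):
    """Return {way_id: [node ids referenced by the way but absent from the file]}."""
    node_ids = set()
    missing = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            node_ids.add(int(elem.get("id")))
            elem.clear()  # drop the node's children to limit memory use
        elif elem.tag == "way":
            refs = [int(nd.get("ref")) for nd in elem.findall("nd")]
            absent = [ref for ref in refs if ref not in node_ids]
            if absent:
                missing[int(elem.get("id"))] = absent
            elem.clear()
    return missing


# Tiny hand-made sample: way id 1 is hypothetical, node ids are from this thread.
SAMPLE = """<osm version="0.6">
  <node id="24490962" lat="0.0" lon="0.0"/>
  <node id="24490964" lat="0.0" lon="0.1"/>
  <way id="1">
    <nd ref="24490962"/>
    <nd ref="24490963"/>
    <nd ref="24490964"/>
  </way>
</osm>"""

print(find_missing_nodes(io.BytesIO(SAMPLE.encode("utf-8"))))
```

With a report like this, an importer could drop only the dangling references instead of deleting the whole way.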
If no one objects, I will continue to post updated lists of the nodes
missing from ways to this list from time to time. I still think it is
possible that there is a bug in the export or merge software (but I
don't mind being proven wrong).
-Mark Granger
----- Original Message -----
From: "Brett Henderson" <brett at bretth.com>
To: "Mark Granger" <grangerfx at gmail.com>
Cc: "OSM Dev List" <dev at openstreetmap.org>
Sent: Thursday, September 17, 2009 7:49 PM
Subject: Re: [OSM-dev] 090916 Missing Nodes
> Mark Granger wrote:
>> As you can see, 24490963 should be between 24490962 and 24490964 but has
>> gone AWOL. Since the same node is missing in two different versions of
>> the planet file, this indicates that it is getting lost during export.
>> Remember that these planet files are from the hypercube server which
>> merges the daily updates with the weekly planet file from the main
>> server. The problem could be with the program that merges the updates or
>> it could be in the code that exports the weekly planet file.
> It is possible that daily files are missing data. They're running with a
> 35 minute delay hour, so if a single changeset upload takes longer than
> that time it might not be included in the daily diff. The minute diffs
> running at a 5 minute delay are frequently losing data.
>
> The newer replication diffs should fix this, but they're not quite ready
> for general consumption yet.
> http://planet.openstreetmap.org/minute-replicate/
>
> The problem with the newer diffs is that they are not timestamp aligned so
> it is not easy to tell exactly what data they include, and they cannot be
> re-generated if problems occur.
>
> I'm thinking that a combined approach of old style diffs and new style
> diffs may be required. Leave the existing daily diffs running with a much
> longer delay to ensure they don't miss data (eg. 12 hours). Produce new
> style transaction id based diffs running every minute. Create hourly
> diffs by rolling up the transaction id diffs. Something like that anyway.
> Keeping a local db up to date will involve downloading and importing a
> planet, then patching it with daily diffs to the current day, then
> patching it with new style hourly/minute diffs to keep up to date.
> Switching from timestamp aligned diffs to transaction id diffs is
> imprecise so an overlap period of several hours would be advisable.
>
> New osmosis tasks will be available to simplify some of this processing.
>
> Brett
>