[OSM-dev] 090916 Missing Nodes

Mark Granger grangerfx at gmail.com
Fri Sep 18 04:02:21 BST 2009


> It is possible that daily files are missing data.  They're running with a 
> 35 minute delay, so if a single changeset upload takes longer than 
> that time it might not be included in the daily diff.  The minute diffs 
> running at a 5 minute delay are frequently losing data.

So how could the planet files generated on two different days have a mostly
identical set of missing nodes?

I hope that the newer diffs do solve this problem because currently it is
impossible to get a complete snapshot of the database without at least some
missing nodes. I hate to delete entire ways just because they are missing a
few nodes when I import the planet data.
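
A minimal sketch of how such a check might look, assuming the input is
OSM XML with all nodes listed before the ways (as in the planet files).
The function name, the sample data, and the way id 1 are hypothetical
and only there for illustration; the node ids are the ones discussed in
this thread.

```python
import io
import xml.etree.ElementTree as ET

# Tiny hand-made sample: way 1 (hypothetical) references a node that is absent.
SAMPLE = b"""<osm version="0.6">
  <node id="24490962" lat="0.0" lon="0.0"/>
  <node id="24490964" lat="0.0" lon="0.1"/>
  <way id="1">
    <nd ref="24490962"/>
    <nd ref="24490963"/>
    <nd ref="24490964"/>
  </way>
</osm>"""


def find_ways_with_missing_nodes(source):
    """Return {way_id: [missing node ids]} for an OSM XML file or stream.

    Assumes every node appears before the first way that references it.
    """
    seen_nodes = set()
    missing = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            seen_nodes.add(int(elem.get("id")))
            elem.clear()  # drop tags/attributes we no longer need
        elif elem.tag == "way":
            refs = [int(nd.get("ref")) for nd in elem.findall("nd")]
            absent = [ref for ref in refs if ref not in seen_nodes]
            if absent:
                missing[int(elem.get("id"))] = absent
            elem.clear()
    return missing


if __name__ == "__main__":
    print(find_ways_with_missing_nodes(io.BytesIO(SAMPLE)))
```

Running this over two planet files and intersecting the results would
show whether the same nodes are missing on both days. For a full planet
the in-memory set of node ids gets very large, so a real run would
likely need a more compact structure.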

If no one objects, I will continue to post updated lists of the nodes missing
from ways to this list from time to time. I still think it is possible
that there could be some bug with the export or merge software (but don't
mind being proven wrong).

-Mark Granger
----- Original Message ----- 
From: "Brett Henderson" <brett at bretth.com>
To: "Mark Granger" <grangerfx at gmail.com>
Cc: "OSM Dev List" <dev at openstreetmap.org>
Sent: Thursday, September 17, 2009 7:49 PM
Subject: Re: [OSM-dev] 090916 Missing Nodes


> Mark Granger wrote:
>> As you can see, 24490963 should be between 24490962 and 24490964 but has 
>> gone AWOL. Since the same node is missing in two different versions of 
>> the planet file, this indicates that it is getting lost during export.
>>  Remember that these planet files are from the hypercube server which 
>> merges the daily updates with the weekly planet file from the main 
>> server. The problem could be with the program that merges the updates or 
>> it could be in the code that exports the weekly planet file.
> It is possible that daily files are missing data.  They're running with a 
> 35 minute delay, so if a single changeset upload takes longer than 
> that time it might not be included in the daily diff.  The minute diffs 
> running at a 5 minute delay are frequently losing data.
>
> The newer replication diffs should fix this, but they're not quite ready 
> for general consumption yet.
> http://planet.openstreetmap.org/minute-replicate/
>
> The problem with the newer diffs is that they are not timestamp aligned so 
> it is not easy to tell exactly what data they include, and they cannot be 
> re-generated if problems occur.
>
> I'm thinking that a combined approach of old style diffs and new style 
> diffs may be required.  Leave the existing daily diffs running with a much 
> longer delay to ensure they don't miss data (eg. 12 hours).  Produce new 
> style transaction id based diffs running every minute.  Create hourly 
> diffs by rolling up the transaction id diffs.  Something like that anyway. 
> Keeping a local db up to date will involve downloading and importing a 
> planet, then patching it with daily diffs to the current day, then 
> patching it with new style hourly/minute diffs to keep up to date. 
> Switching from timestamp aligned diffs to transaction id diffs is 
> imprecise so an overlap period of several hours would be advisable.
>
> New osmosis tasks will be available to simplify some of this processing.
>
> Brett
> 
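
The way a timestamp-aligned diff can lose data, as described above, can
be illustrated with a small sketch. It rests on assumptions that are
not confirmed in this thread: that an element's timestamp is fixed
before its upload transaction commits, that the diff is extracted at
the end of its window plus the delay, and that rows not yet committed
at that moment are invisible to the extraction. The function name and
the example times are hypothetical.

```python
from datetime import datetime, timedelta


def included_in_timestamp_diff(element_timestamp, commit_time,
                               window_start, window_end, delay):
    """True if a timestamp-selected diff for [window_start, window_end)
    would contain the element.

    Assumption: extraction happens at window_end + delay and can only
    see rows that have been committed by then.
    """
    extraction_time = window_end + delay
    in_window = window_start <= element_timestamp < window_end
    visible = commit_time <= extraction_time
    return in_window and visible


if __name__ == "__main__":
    day_start = datetime(2009, 9, 16)
    day_end = datetime(2009, 9, 17)
    delay = timedelta(minutes=35)

    stamped = datetime(2009, 9, 16, 23, 50)   # timestamp given to the node
    committed = datetime(2009, 9, 17, 0, 40)  # upload finishes 50 minutes later

    # Not visible when the diff for the 16th is extracted at 00:35 ...
    print(included_in_timestamp_diff(stamped, committed, day_start, day_end, delay))
    # ... and outside the window of the diff for the 17th.
    print(included_in_timestamp_diff(stamped, committed, day_end,
                                     day_end + timedelta(days=1), delay))
```

Under these assumptions the node falls into neither diff, which would
also explain why a longer delay (for example 12 hours) reduces the
risk, and why diffs based on transaction ids avoid it.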




