[OSM-dev] Missing nodes in 200904210852-200904210853.osc.gz (should be 200905020852-200905020853.osc.gz)

Brett Henderson brett at bretth.com
Mon May 4 15:24:18 BST 2009

In addition to the normal minute diffs, there are now "slow" minute 
diffs available here running 30 minutes behind the main API:

They are not intended to be used directly, but if you have doubts about 
the contents of the standard minute diffs please check these files as 
well.  If their contents differ from the main diffs then a transaction 
has been committed too late to the database to be included in the 
osmosis changeset.  The only option then is to switch to the delayed 
changesets until the problem period has passed, then switch back to the 
main changesets.  If I'm around I'll re-generate the minute changesets 
straight away but the chances I'll be around are fairly small because 
the busy periods are typically not in my waking hours.
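
To illustrate how a late commit can slip past the standard diff, here is a 
minimal sketch, not the actual osmosis logic.  It assumes the extraction for 
a one-minute window runs exactly "lag" after the window closes and only sees 
rows already committed by then.  The row ids, commit times and the function 
name extract_window are all hypothetical.

```python
from datetime import datetime, timedelta

def extract_window(rows, window_start, lag):
    """Return ids of rows timestamped inside the one-minute window that are
    already committed (visible) when the extraction runs.

    Assumption: the extraction runs at window_end + lag.
    """
    window_end = window_start + timedelta(minutes=1)
    run_at = window_end + lag
    return [
        r["id"]
        for r in rows
        if window_start <= r["timestamp"] < window_end and r["committed"] <= run_at
    ]

# Hypothetical rows: both carry a timestamp inside the 08:52 window, but the
# second belongs to a large upload whose commit lands 13 minutes later.
ROWS = [
    {"id": 1, "timestamp": datetime(2009, 5, 2, 8, 52, 22),
     "committed": datetime(2009, 5, 2, 8, 52, 23)},
    {"id": 2, "timestamp": datetime(2009, 5, 2, 8, 52, 22),
     "committed": datetime(2009, 5, 2, 9, 5, 0)},
]

WINDOW = datetime(2009, 5, 2, 8, 52)
main_ids = extract_window(ROWS, WINDOW, timedelta(minutes=5))
slow_ids = extract_window(ROWS, WINDOW, timedelta(minutes=30))
print("5 minute lag:", main_ids)
print("30 minute lag:", slow_ids)
```

With these made-up times the 5 minute extraction sees only row 1, while the 
30 minute extraction sees both, which is why the slow diffs can serve as a 
reference.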

I have an audit process now comparing the results of the two minute 
processes and I'll send an email around if I detect any anomalies.  I 
may make the results of this audit process public when I get time.
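
For anyone wanting to run a similar check locally, the comparison could be 
sketched roughly as below.  This is not the audit process itself, only an 
illustration that assumes both files are osmChange documents already 
decompressed into strings.  The function names are hypothetical, and the 
sample documents are cut down to id and version attributes for brevity.

```python
import xml.etree.ElementTree as ET

def entity_keys(osc_xml):
    """Collect (action, type, id, version) for every entity in an osmChange
    document, where action is create/modify/delete and type is
    node/way/relation."""
    root = ET.fromstring(osc_xml)
    keys = set()
    for action in root:
        for entity in action:
            keys.add((action.tag, entity.tag, entity.get("id"), entity.get("version")))
    return keys

def missing_from_main(main_xml, delayed_xml):
    """Entities present in the delayed diff but absent from the main diff."""
    return sorted(entity_keys(delayed_xml) - entity_keys(main_xml))

# Illustrative sample data only; real diffs carry many more attributes.
MAIN = """<osmChange version="0.6">
  <create>
    <node id="388501275" version="1"/>
  </create>
</osmChange>"""

DELAYED = """<osmChange version="0.6">
  <create>
    <node id="388501275" version="1"/>
    <node id="388501322" version="1"/>
  </create>
</osmChange>"""

for key in missing_from_main(MAIN, DELAYED):
    print("missing from main diff:", key)
```

An empty result means the two diffs agree for that minute; anything listed 
was committed too late for the main diff.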

For interest's sake, there is also an experimental set of "fast" minute 
diffs running 1 minute behind the API but please don't use them for 
production systems.  The link is below:

If anybody has any questions or suggestions please let me know.

Brett Henderson wrote:
> Hi Everybody,
> It appears that there are some further warts in the osmosis diffs.  
> But this time it only impacts the minute diffs.
> As per Alfon's message below there is some missing data in the 
> 200905020852-200905020853.osc.gz minute diff.  The missing data 
> appears to belong to changeset 1045077 which is a monster containing a 
> large number of entities.  My best guess is that the changeset took a 
> long time to insert (perhaps rails was choking on it for some time?) 
> and as a result the final commit occurred more than 5 minutes after 
> the initial data was added.  This meant that the osmosis extraction 
> occurred before the data became visible.  The hourly changeset which 
> runs 30 minutes later included the changeset so whatever the problem 
> was had corrected itself by that time.
> **** Correcting The Problem ****
> If you have a database using the minute diffs, the best option is to 
> reset the timestamp back to some time before May 2nd, 8am and catch up 
> using hourly or daily diffs.  From there resume processing with minute 
> diffs.
> **** Future Avoidance ****
> Unfortunately with the current method of extracting diffs, there is 
> always a risk this may occur.  With the current minute lag interval of 
> 5 minutes it is very rare, but not impossible.  I am now setting up 
> another minute diff process running 30 minutes behind the API which 
> I'll use to audit the minute diff process.  At least this way I'll 
> know if they occur again.  If it is a regular occurrence then a better 
> solution will have to be devised.  If it never happens again then I'll 
> put it down to cosmic rays or a 0.6 wrinkle that has since been fixed.
> Brett
> a_a at gmx.de wrote:
>> Hi Brett,
>> Brett Henderson wrote:
>>> Hi Alfons,
>>> Where did you get the minute files?  They're not available on 
>>> planet.openstreetmap.org any longer.  There were some problems when 
>>> API 0.6 was first deployed, but the problem was corrected and the 
>>> problem change files were re-generated.  Is it possible you have 
>>> some of the bad files produced during that period?  I'm not aware of 
>>> any problems with the files currently being produced.
>> Damn, it was the wrong filename :-( (My mistake, I just looked at the 
>> time "0852")
>> It should be "200905020852-200905020853.osc", but at least the data 
>> posted below is correct. (see timestamp "2009-05-02T08:52:22Z")
>>> a_a at gmx.de wrote:
>>>> Hello Brett,
>>>> it seems to me that there are several nodes missing in
>>>> 200904210852-200904210853.osc.gz
>>>> (taken from minute diffs) e.g. especially 388501322, 388501324 and 
>>>> 388501325.
>>>> Looking at lines 52-57
>>>>     <node id="388501275" version="1" 
>>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen 
>>>> Panther" lat="49.0172287" lon="11.4147053"/>
>>>>     <node id="388501276" version="1" 
>>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen 
>>>> Panther" lat="49.0170374" lon="11.413804"/>
>>>>     <node id="388501277" version="1" 
>>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen 
>>>> Panther" lat="49.0167492" lon="11.4139295"/>
>>>>     <node id="388501426" version="1" 
>>>> timestamp="2009-05-02T08:52:33Z" uid="52495" user="seawolff" 
>>>> lat="54.5844367" lon="9.8205745"/>
>>>>     <node id="388501487" version="1" 
>>>> timestamp="2009-05-02T08:52:43Z" uid="45565" user="flinki" 
>>>> lat="53.9425242" lon="11.3160177"/>
>>>>     <node id="388501488" version="1" 
>>>> timestamp="2009-05-02T08:52:43Z" uid="45565" user="flinki" 
>>>> lat="53.9422031" lon="11.3166454"/>
>>>> from "200904210852-200904210853.osc" it seems that many more are 
>>>> missing.
>>>> And for the ways at least ways 33909155 and 33909185 are also 
>>>> missing in that minute file.
>>>> Do you have any clue why?
>>>> Thanks in advance and best regards
>>>> Alfons
