[OSM-dev] Missing nodes in 200904210852-200904210853.osc.gz (should be 200905020852-200905020853.osc.gz)

Brett Henderson brett at bretth.com
Mon May 4 13:41:57 BST 2009


Hi Everybody,

It appears that there are some further warts in the osmosis diffs.  But 
this time it only impacts the minute diffs.

As per Alfons' message below there is some missing data in the 
200905020852-200905020853.osc.gz minute diff.  The missing data appears 
to belong to changeset 1045077 which is a monster containing a large 
number of entities.  My best guess is that the changeset took a long 
time to insert (perhaps rails was choking on it for some time?) and as a 
result the final commit occurred more than 5 minutes after the initial 
data was added.  This meant that the osmosis extraction occurred before 
the data became visible.  The hourly diff, which runs 30 minutes 
later, included the changeset, so whatever the problem was had corrected 
itself by that time.
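
To make that guess concrete, here is a minimal sketch in Python of how a 
timestamp-window extraction can miss rows whose transaction commits late.  
It is only a model, not how osmosis is actually implemented: the row layout, 
the extract() helper and the commit times are all made up for illustration, 
and the timestamp on node 388501322 is assumed to match its neighbours.

```python
from datetime import datetime, timedelta

# Hypothetical model: "timestamp" is written when the row is inserted,
# "committed" is when the transaction commits and the row becomes visible.
rows = [
    # Part of the slow changeset; the commit time is an assumption.
    {"id": 388501322, "timestamp": datetime(2009, 5, 2, 8, 52, 22),
     "committed": datetime(2009, 5, 2, 8, 59, 0)},
    # An ordinary edit that commits almost immediately.
    {"id": 388501426, "timestamp": datetime(2009, 5, 2, 8, 52, 33),
     "committed": datetime(2009, 5, 2, 8, 52, 34)},
]

def extract(rows, start, end, run_at):
    """Return ids with start <= timestamp < end that are visible at run_at."""
    return [r["id"] for r in rows
            if start <= r["timestamp"] < end and r["committed"] <= run_at]

start = datetime(2009, 5, 2, 8, 52)
end = start + timedelta(minutes=1)

# Extraction running 5 minutes behind: the slow row is not yet visible.
minute_run = extract(rows, start, end, run_at=end + timedelta(minutes=5))
# Extraction running 30 minutes behind: both rows are visible.
audit_run = extract(rows, start, end, run_at=end + timedelta(minutes=30))

print(minute_run)  # [388501426]
print(audit_run)   # [388501322, 388501426]
```

The row that commits after the 5 minute lag never shows up in a later minute 
diff either, because its timestamp belongs to a window that has already been 
extracted.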

**** Correcting The Problem ****
If you have a database using the minute diffs, the best option is to 
reset the timestamp back to some time before May 2nd, 8am and catch up 
using hourly or daily diffs.  From there resume processing with minute 
diffs.

**** Future Avoidance ****
Unfortunately with the current method of extracting diffs, there is 
always a risk this may occur.  With the current minute lag interval of 5 
minutes it is very rare, but not impossible.  I am now setting up 
another minute diff process running 30 minutes behind the API which I'll 
use to audit the minute diff process.  At least this way I'll know if 
the problem occurs again.  If it is a regular occurrence then a better solution 
will have to be devised.  If it never happens again then I'll put it 
down to cosmic rays or a 0.6 wrinkle that has since been fixed.
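
As a rough idea of what such an audit could look like, the Python sketch 
below compares two osmChange documents covering the same interval and lists 
the entities that appear only in the later (audit) extraction.  The function 
names and the cut-down sample documents are hypothetical, and this is not the 
audit process I am actually setting up.

```python
import xml.etree.ElementTree as ET

def entity_keys(osc_xml):
    """Collect (action, type, id, version) for every entity in an osmChange document."""
    keys = set()
    root = ET.fromstring(osc_xml)
    for action in root:        # <create>, <modify>, <delete>
        for entity in action:  # <node>, <way>, <relation>
            keys.add((action.tag, entity.tag,
                      entity.get("id"), entity.get("version")))
    return keys

def missing_from(minute_xml, audit_xml):
    """Entities present in the audit diff but absent from the minute diff."""
    return sorted(entity_keys(audit_xml) - entity_keys(minute_xml))

# Cut-down sample documents, for illustration only.
minute = ('<osmChange version="0.6"><create>'
          '<node id="388501426" version="1"/>'
          '</create></osmChange>')
audit = ('<osmChange version="0.6"><create>'
         '<node id="388501322" version="1"/>'
         '<node id="388501426" version="1"/>'
         '</create></osmChange>')

print(missing_from(minute, audit))
# [('create', 'node', '388501322', '1')]
```

For real .osc.gz files the text could be read with gzip.open(path, "rt") 
before being passed in.  An empty result means the two extractions agree for 
that interval.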

Brett

a_a at gmx.de wrote:
> Hi Brett,
>
> Brett Henderson wrote:
>> Hi Alfons,
>>
>> Where did you get the minute files?  They're not available on 
>> planet.openstreetmap.org any longer.  There were some problems when 
>> API 0.6 was first deployed, but the problem was corrected and the 
>> problem change files were re-generated.  Is it possible you have some 
>> of the bad files produced during that period?  I'm not aware of any 
>> problems with the files currently being produced.
> Damn, it was the wrong filename :-( (My mistake, I just looked at the 
> time "0852")
>
> It should be "200905020852-200905020853.osc", but at least the data 
> posted below is correct. (see timestamp "2009-05-02T08:52:22Z")
>
>
>
>> a_a at gmx.de wrote:
>>> Hello Brett,
>>>
>>> it seems to me that there are several nodes missing in
>>>
>>> 200904210852-200904210853.osc.gz
>>>
>>> (taken from minute diffs) e.g. especially 388501322, 388501324 and 
>>> 388501325.
>>>
>>> Looking at lines 52-57
>>>
>>>     <node id="388501275" version="1" 
>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen Panther" 
>>> lat="49.0172287" lon="11.4147053"/>
>>>     <node id="388501276" version="1" 
>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen Panther" 
>>> lat="49.0170374" lon="11.413804"/>
>>>     <node id="388501277" version="1" 
>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen Panther" 
>>> lat="49.0167492" lon="11.4139295"/>
>>>     <node id="388501426" version="1" 
>>> timestamp="2009-05-02T08:52:33Z" uid="52495" user="seawolff" 
>>> lat="54.5844367" lon="9.8205745"/>
>>>     <node id="388501487" version="1" 
>>> timestamp="2009-05-02T08:52:43Z" uid="45565" user="flinki" 
>>> lat="53.9425242" lon="11.3160177"/>
>>>     <node id="388501488" version="1" 
>>> timestamp="2009-05-02T08:52:43Z" uid="45565" user="flinki" 
>>> lat="53.9422031" lon="11.3166454"/>
>>>
>>> from "200904210852-200904210853.osc" it seems that many more are 
>>> missing.
>>> And for the ways at least ways 33909155 and 33909185 are also 
>>> missing in that minute file.
>>> Do you have any clue why?
>>>
>>>
>>> Thanks in advance and best regards
>>>
>>> Alfons
>>
>




