[OSM-dev] Missing nodes in 200904210852-200904210853.osc.gz (should be 200905020852-200905020853.osc.gz)
Brett Henderson
brett at bretth.com
Mon May 4 13:41:57 BST 2009
Hi Everybody,
It appears that there are some further warts in the osmosis diffs, but
this time they only affect the minute diffs.
As per Alfons' message below there is some missing data in the
200905020852-200905020853.osc.gz minute diff. The missing data appears
to belong to changeset 1045077 which is a monster containing a large
number of entities. My best guess is that the changeset took a long
time to insert (perhaps rails was choking on it for some time?) and as a
result the final commit occurred more than 5 minutes after the initial
data was added. This meant that the osmosis extraction occurred before
the data became visible. The hourly diff, which runs 30 minutes
later, included the changeset, so whatever the problem was had corrected
itself by that time.
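
To illustrate the timing problem described above, here is a minimal
Python sketch. It is not how osmosis is implemented. It assumes a
simplified model in which an extraction selects entities by timestamp
and only sees rows whose transaction has already committed. The Row
class, the extract_minute function, the commit times and the timestamp
of node 388501322 are all hypothetical; only the node ids and the
5 and 30 minute lags come from this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Row:
    entity_id: int
    timestamp: datetime     # time stamped on the entity when it was written
    committed_at: datetime  # time the transaction became visible (hypothetical)


def extract_minute(rows, minute_start, lag):
    """Return ids stamped inside the minute that are visible at run time.

    Assumption: the extraction runs `lag` after the interval closes.
    """
    minute_end = minute_start + timedelta(minutes=1)
    run_at = minute_end + lag
    return [
        r.entity_id
        for r in rows
        if minute_start <= r.timestamp < minute_end and r.committed_at <= run_at
    ]


start = datetime(2009, 5, 2, 8, 52)
rows = [
    # Small upload: committed almost immediately.
    Row(388501275, datetime(2009, 5, 2, 8, 52, 22), datetime(2009, 5, 2, 8, 52, 23)),
    # Part of a huge changeset: stamped at 08:52 but committed much later.
    Row(388501322, datetime(2009, 5, 2, 8, 52, 25), datetime(2009, 5, 2, 9, 0, 0)),
]

minute_result = extract_minute(rows, start, timedelta(minutes=5))
late_result = extract_minute(rows, start, timedelta(minutes=30))
print(minute_result)  # the late-committed node is absent
print(late_result)    # a 30 minute lag picks it up
```

In this model the 5 minute extraction never revisits the 08:52
interval, so a row that commits after the run is lost from the minute
diffs even though a later, slower extraction sees it.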
**** Correcting The Problem ****
If you have a database using the minute diffs, the best option is to
reset the timestamp back to some time before May 2nd, 8am and catch up
using hourly or daily diffs. From there resume processing with minute
diffs.
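
The catch-up plan above can be sketched as a list of hourly intervals
to apply before switching back to minute diffs. This is only an
illustration: hourly_windows is a hypothetical helper, the "now" value
is made up, and the sketch ignores the delay between the end of an
hour and the hourly diff actually being published.

```python
from datetime import datetime, timedelta


def hourly_windows(reset_to, now):
    """Yield (start, end) for every complete hour from reset_to up to now."""
    start = reset_to.replace(minute=0, second=0, microsecond=0)
    one_hour = timedelta(hours=1)
    while start + one_hour <= now:
        yield start, start + one_hour
        start += one_hour


# Reset to 07:00 on May 2nd (before the bad minute diff), catching up
# at a hypothetical current time of 10:20.
windows = list(hourly_windows(datetime(2009, 5, 2, 7, 0),
                              datetime(2009, 5, 2, 10, 20)))
for start, end in windows:
    print(start, "->", end)

# Minute diffs would resume from the end of the last hourly window.
resume_from = windows[-1][1]
```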
**** Future Avoidance ****
Unfortunately with the current method of extracting diffs, there is
always a risk this may occur. With the current minute lag interval of 5
minutes it is very rare, but not impossible. I am now setting up
another minute diff process running 30 minutes behind the API which I'll
use to audit the minute diff process. At least this way I'll know if
it occurs again. If it is a regular occurrence then a better solution
will have to be devised. If it never happens again then I'll put it
down to cosmic rays or a 0.6 wrinkle that has since been fixed.
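
One way such an audit could work, sketched in Python: compare the
entities in a minute diff with those in a delayed extraction of the
same interval and report anything the minute diff lacks. This is an
assumption about the approach, not a description of the actual audit
process. The function names are hypothetical and the sample XML is
hand-written for illustration, using ids mentioned in this thread.
Real .osc.gz files would first need to be read with gzip.

```python
import xml.etree.ElementTree as ET


def entity_keys(osc_xml):
    """Collect (type, id, version) for every entity in an osmChange document."""
    root = ET.fromstring(osc_xml)
    keys = set()
    for action in root:        # <create>, <modify>, <delete>
        for entity in action:  # <node>, <way>, <relation>
            keys.add((entity.tag, entity.get("id"), entity.get("version")))
    return keys


def missing_from_minute(minute_xml, audit_xml):
    """Entities present in the delayed extraction but not in the minute diff."""
    return sorted(entity_keys(audit_xml) - entity_keys(minute_xml))


minute_xml = """<osmChange version="0.6">
  <create>
    <node id="388501275" version="1" lat="49.0172287" lon="11.4147053"/>
  </create>
</osmChange>"""

# Hand-written stand-in for the same interval extracted 30 minutes later.
audit_xml = """<osmChange version="0.6">
  <create>
    <node id="388501275" version="1" lat="49.0172287" lon="11.4147053"/>
    <node id="388501322" version="1" lat="0" lon="0"/>
    <way id="33909155" version="1"/>
  </create>
</osmChange>"""

missing = missing_from_minute(minute_xml, audit_xml)
for kind, entity_id, version in missing:
    print(kind, entity_id, "v" + version)
```

An empty result for every interval would suggest the 5 minute lag is
sufficient; any non-empty result flags a minute diff that needs to be
regenerated.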
Brett
a_a at gmx.de wrote:
> Hi Brett,
>
> Brett Henderson wrote:
>> Hi Alfons,
>>
>> Where did you get the minute files? They're not available on
>> planet.openstreetmap.org any longer. There were some problems when
>> API 0.6 was first deployed, but the problem was corrected and the
>> problem change files were re-generated. Is it possible you have some
>> of the bad files produced during that period? I'm not aware of any
>> problems with the files currently being produced.
> Damn, it was the wrong filename :-( (My mistake, I just looked at the
> time "0852")
>
> It should be "200905020852-200905020853.osc", but at least the data
> posted below is correct. (see timestamp "2009-05-02T08:52:22Z")
>
>
>
>> a_a at gmx.de wrote:
>>> Hello Brett,
>>>
>>> it seems to me that there are several nodes missing in
>>>
>>> 200904210852-200904210853.osc.gz
>>>
>>> (taken from minute diffs) e.g. especially 388501322, 388501324 and
>>> 388501325.
>>>
>>> Looking at lines 52-57
>>>
>>> <node id="388501275" version="1"
>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen Panther"
>>> lat="49.0172287" lon="11.4147053"/>
>>> <node id="388501276" version="1"
>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen Panther"
>>> lat="49.0170374" lon="11.413804"/>
>>> <node id="388501277" version="1"
>>> timestamp="2009-05-02T08:52:22Z" uid="62236" user="Paulchen Panther"
>>> lat="49.0167492" lon="11.4139295"/>
>>> <node id="388501426" version="1"
>>> timestamp="2009-05-02T08:52:33Z" uid="52495" user="seawolff"
>>> lat="54.5844367" lon="9.8205745"/>
>>> <node id="388501487" version="1"
>>> timestamp="2009-05-02T08:52:43Z" uid="45565" user="flinki"
>>> lat="53.9425242" lon="11.3160177"/>
>>> <node id="388501488" version="1"
>>> timestamp="2009-05-02T08:52:43Z" uid="45565" user="flinki"
>>> lat="53.9422031" lon="11.3166454"/>
>>>
>>> from "200904210852-200904210853.osc" it seems that many more are
>>> missing.
>>> And for the ways at least ways 33909155 and 33909185 are also
>>> missing in that minute file.
>>> Do you have any clue why?
>>>
>>>
>>> Thanks in advance and best regards
>>>
>>> Alfons
>>
>