[osmosis-dev] osmchange file produced by osmconvert gives osmosis error
Igor Podolskiy
igor.podolskiy at vwi-stuttgart.de
Tue Aug 7 13:39:49 BST 2012
Hi,
On 04.08.2012 07:01, Martijn van Exel wrote:
> I produced this small osmchange file using a recently compiled
> osmconvert: https://gist.github.com/3254628
> the osmosis --rxc task chokes on it though: https://gist.github.com/3254632
>
> Ideas?
yes, the cause of this is pretty simple. Osmosis requires lat and lon
attributes on nodes, and these attributes are not set in the change file
on the deleted nodes.
While the cause is pretty simple, the fix for this (_if_ we want to fix
this) isn't going to be simple. The abscence of the lat/lon attributes
on the deleted nodes is conceptually OK since it is going to be ignored
anyway. However, the <node> parser (NodeElementProcessor to be precise)
does not know about the context of the <node> element - and the
osmChange <delete> is the only case where lat/lon is actually optional.
Also, the Osmosis data model (the Node class) has an implicit assumption
that the nodes always have lat/lon - the fields are typed as double and
can't thus be set to null, and there is no such thing like
Node.isPositionDefined().
There also was a recent related API change which makes lat/lon optional
for already deleted (visible="false") nodes, so now seems a good point
to talk about that too.
So what can be done about this:
1. Make the lat/lon attributes optional and change the data model
accordingly.
2. Make the lat/lon attributes optional everywhere and don't change the
data model.
3. Teach the NodeElementProcessor about context it is called from and
make it generate some bogus and/or sentinel values for the Node classes
in that case (0, Double.NaN, Double.MAX_VALUE, whatever).
4. Ignore the whole issue and just say that lat/lon is required.
My personal and also very humble opinion about those:
1.: For this, we would need to change all the downstream tasks to check
if the position is actually there. Given that the correct absence of
lat/lon is a quite rare case, this seems rather costly.
2. and 3.: These are actually almost the same thing. If only the parser
thinks lat/lon are optional and the downstream tasks think it's always
there, the parser needs to put some value in the node for
Node.getLatitude() to return.
The only difference between 2 and 3 is the amount of validation done in
the parser (and therefore the assumptions the downstream tasks can have
about the data). I think it's better validate in the parser and fail
fast. Yes, bringing context to NodeElementProcessor is a bit ugly - but
hey, it's deep down in the implementation classes and nobody sees it. If
the input is actually broken (normal OSM XML with lon/lat not set), it's
better to fail in the parser than pretend that it's OK leading to
undefined behavior downstream.
4.: Well, we _could_ ignore osmconvert producing those files (no offense
intended, Markus) and argue that it's a bug in osmconvert and it should
be fixed on that side and so on. But I don't think it is wise to ignore
the API... So this is not an option, despite having a very attractive
implementation cost of zero :)
To sum it up, I tend to prefer 3.
Other thoughts on this point, anyone? :)
Greetings from Stuttgart
Igor
More information about the osmosis-dev
mailing list