[Tilesathome] Inconsistent data from OSMXAPI

Brett Henderson brett at bretth.com
Tue Nov 20 10:39:58 GMT 2007


80n wrote:
> On Nov 20, 2007 1:28 AM, Brent Easton <b.easton at exemail.com.au> wrote:
>
>     I am seeing inconsistent data being returned by OSMXAPI when doing
>     lowzooms. Ways are referencing nodes that are not included in
>     download. When you go and view the actual data, that node no longer
>     exists. This seems to indicate that it is some sort of timing
>     issue. The end result is that Batik crashes trying to render the
>     lowzoom tiles.
>
>
> Osmxapi *should* be consistent providing its sources are consistent.  
> I know planet.osm is not guaranteed to be consistent but I think that 
> the planet diffs are supposed to be.  Brett can you confirm that?
Unfortunately it won't be consistent, because the production database 
itself doesn't enforce consistency.  Planet.osm has two sources of 
inconsistency:
1. dump timing issues, where ways get dumped after nodes, leading to 
   ways that refer to non-existent nodes; and
2. referential integrity issues in the db itself.
Osmosis fixes the first one, but not the second.
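
To make the symptom concrete, here is a rough Python sketch of a check for 
ways that refer to nodes missing from the same file.  It is not part of 
Osmosis or osmxapi; the function name find_dangling_refs and the sample 
data are made up for illustration, and it assumes the usual OSM XML layout 
of <node id=...> elements and <way> elements with <nd ref=...> children.

```python
import io
import xml.etree.ElementTree as ET


def find_dangling_refs(osm_source):
    """Return {way_id: [missing node ids]} for ways that reference absent nodes.

    Hypothetical helper for illustration only; ids are kept as strings.
    """
    node_ids = set()
    way_refs = {}
    # Collect everything first, so the result does not depend on whether
    # a way appears before or after its nodes in the file.
    for _, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag == "node":
            node_ids.add(elem.get("id"))
        elif elem.tag == "way":
            way_refs[elem.get("id")] = [nd.get("ref") for nd in elem.findall("nd")]

    dangling = {}
    for way_id, refs in way_refs.items():
        missing = [ref for ref in refs if ref not in node_ids]
        if missing:
            dangling[way_id] = missing
    return dangling


# Made-up sample: way 11 refers to node 3, which is not in the file.
SAMPLE = b"""<osm version="0.5">
  <node id="1" lat="0.0" lon="0.0"/>
  <node id="2" lat="0.0" lon="0.1"/>
  <way id="10"><nd ref="1"/><nd ref="2"/></way>
  <way id="11"><nd ref="2"/><nd ref="3"/></way>
</osm>"""

dangling = find_dangling_refs(io.BytesIO(SAMPLE))
print(dangling)
```

A check like this only reports the problem.  It holds every node id in 
memory, so it suits extracts rather than a whole planet.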

If you want to generate a consistent planet, you should be able to run 
the entire planet through a single bounding box as follows:
osmosis --read-xml file=planet-in.osm \
        --bounding-box idTrackerType=BitSet \
        --write-xml file=planet-out.osm

(I haven't tested the above but it should be okay)

The idTrackerType argument makes the bbox use a BitSet for node tracking, 
which is more efficient for large numbers of nodes.  The default is an 
idList, which is better for small bboxes but will consume approx 32 times 
as much RAM for an entire planet.
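
The factor of 32 is what you would expect if a list stores each id as a 
32-bit integer while a bit set spends one bit per possible id.  That 
reading is an assumption, not a statement about Osmosis internals.  A 
minimal Python sketch of the trade-off follows; BitSetIdTracker and 
list_size_bytes are hypothetical names, not Osmosis classes.

```python
class BitSetIdTracker:
    """Tracks non-negative integer ids using one bit per possible id."""

    def __init__(self, max_id):
        # One bit for every id in the range 0..max_id, rounded up to whole bytes.
        self._bits = bytearray(max_id // 8 + 1)

    def add(self, node_id):
        self._bits[node_id >> 3] |= 1 << (node_id & 7)

    def __contains__(self, node_id):
        byte_index = node_id >> 3
        if node_id < 0 or byte_index >= len(self._bits):
            return False
        return bool(self._bits[byte_index] & (1 << (node_id & 7)))

    def size_bytes(self):
        return len(self._bits)


def list_size_bytes(id_count, bytes_per_id=4):
    """Size of a plain list of ids, assuming 4 bytes (32 bits) per stored id."""
    return id_count * bytes_per_id


tracker = BitSetIdTracker(max_id=1_000_000)
for node_id in (1, 42, 999_999):
    tracker.add(node_id)

# Dense case (nearly every id present, as with a whole planet):
dense_ratio = list_size_bytes(1_000_000) / tracker.size_bytes()
# Sparse case (a small bbox selecting only three nodes):
sparse_list_bytes = list_size_bytes(3)

print(round(dense_ratio), sparse_list_bytes, tracker.size_bytes())
```

The bit set costs the same however few ids it holds, which is why a list 
is the better choice for a small bbox and the bit set wins when most ids 
are selected.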

Are you applying daily diffs direct to a db or are you creating new 
planets every day?  If you're creating new planets the above option may 
work.  If you're using osmosis to generate planets, you could even add the 
bbox step to the main diff application pipeline, which should aid 
performance significantly.

Unfortunately I have no way of fixing consistency issues in a daily diff 
other than to create a new planet and fix it.  Working around the 
production database integrity issues is not simple.  The best fix is to 
add referential integrity to the db to avoid these problems in the first 
place, but I suspect that won't occur for a while.  I'm working on a 
pgsql schema in the background but not making much progress due to other 
work at the moment.

The biggest contributor to ongoing referential integrity issues is 
potlatch.  The main API attempts to work around the lack of database 
referential integrity, but potlatch doesn't go through the main API and is 
continually creating new ways with deleted nodes.  I believe RichardF is 
currently tearing his hair out trying to fix it ...
>
> I ran a full consistency check on osmxapi just a couple of days ago.  
> At that point it seemed ok.  Brent, can you give me some example way 
> Ids so that I can track down where the problem came from please?
>  
>
>
>     Is this a known issue? I am presuming the planet.osm dump is
>     inconsistent.
>
>     Where should we look at solving this problem :-
>
>     1. Planet.osm dump is inconsistent, fix that?
>
>
> Osmxapi uses planet diff files so fixing planet.osm won't help much.
>
>
>     2. Fix OSMXAPI to not emit nodes in ways that have not been
>     included in the download file?
>
>
> It's intentional behaviour that if a way references a non-existent 
> node then the node will not be in the output, but the way will still 
> contain the offending <nd> tag.  It didn't seem to make sense to 
> remove it as the client cannot then detect that the way is 
> incomplete.  The client should choose how to process such error 
> situations, although this fact should be mentioned in the docs.
>  
>
>
>
>     3. Have Osmarender ignore undefined nodes? Is this possible with XSLT?
>
>
> It used to be able to deal with missing nodes, but maybe this broke 
> during the 0.4 to 0.5 migration.
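
For the client-side handling 80n describes, one option is to strip the 
dangling <nd> references before the data reaches the renderer.  The Python 
sketch below is only an illustration of that choice, not how Osmarender or 
osmxapi behave; drop_dangling_refs and the sample data are made up, and it 
assumes nodes and ways are direct children of the root <osm> element.

```python
import xml.etree.ElementTree as ET


def drop_dangling_refs(root):
    """Remove <nd> children that point at absent nodes.

    Returns the ids of the ways that were changed, so the caller can still
    tell which ways are incomplete.  Hypothetical helper for illustration.
    """
    node_ids = {node.get("id") for node in root.findall("node")}
    incomplete = []
    for way in root.findall("way"):
        missing = [nd for nd in way.findall("nd") if nd.get("ref") not in node_ids]
        for nd in missing:
            way.remove(nd)
        if missing:
            incomplete.append(way.get("id"))
    return incomplete


# Made-up sample: way 11 refers to node 3, which is not in the download.
SAMPLE = """<osm version="0.5">
  <node id="1" lat="0.0" lon="0.0"/>
  <node id="2" lat="0.0" lon="0.1"/>
  <way id="10"><nd ref="1"/><nd ref="2"/></way>
  <way id="11"><nd ref="2"/><nd ref="3"/></way>
</osm>"""

root = ET.fromstring(SAMPLE)
incomplete = drop_dangling_refs(root)
remaining = [nd.get("ref") for nd in root.find("way[@id='11']")]
print(incomplete, remaining)
```

A way left with fewer than two nodes cannot be drawn as a line, so a 
client using this approach would probably want to skip such ways as well.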
Let me know if there's any way I can help.

Brett

PS.  I'm not on the tiles list so cc me if necessary ...




