[OSM-dev] Data checks?

Frederik Ramm frederik at remote.org
Sat Feb 16 22:12:21 GMT 2008


> > What has already been done on data checks, and is anyone else working on 
> > it? What would be some good ideas to start with, and goals to work toward?

> I've coded a script to parse planet and return a list of all the 
> excessively long ways (data is available from 
> http://openstreetmap.ca/statistics/) which I am currently manually 
> correcting (in this case "excessively long" is defined as more than 
> 500 nodes). Kleptog has the coastline error checker which has been 
> extremely helpful as well, and I believe has been used to fix the little 
> coast errors that the checker turns up in some areas.

One thing that isn't being done at the moment is referential
integrity checks. In theory there should never be any ways or
relations referring to deleted or non-existent objects, but
every now and then one turns up (people report strange errors, you
check, and you find there's an inconsistency).
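
To make the idea concrete, here is a minimal sketch of such a check in
Python. It assumes an OSM XML extract in which ways list their nodes as
<nd ref="..."/> and relations list their members as
<member type="..." ref="..."/>. The function name find_dangling_refs and
the sample data are made up for illustration, and holding every id in
memory would need rethinking for a full planet file.

```python
import io
import xml.etree.ElementTree as ET


def find_dangling_refs(source):
    """Return (referrer_type, referrer_id, target_type, target_id) tuples
    for every reference to an object that is absent from the file."""
    present = {"node": set(), "way": set(), "relation": set()}
    refs = []  # references collected while streaming, checked at the end
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in present:
            continue  # <nd>, <member>, <tag>: handled via their parent
        obj_id = int(elem.get("id"))
        present[elem.tag].add(obj_id)
        if elem.tag == "way":
            for nd in elem.findall("nd"):
                refs.append(("way", obj_id, "node", int(nd.get("ref"))))
        elif elem.tag == "relation":
            for m in elem.findall("member"):
                refs.append(("relation", obj_id, m.get("type"), int(m.get("ref"))))
        elem.clear()  # drop children and attributes we no longer need
    # Checking only after the whole file is read means the order of
    # objects in the file does not matter.
    return [r for r in refs if r[3] not in present.get(r[2], set())]


# Hypothetical sample: way 10 refers to a missing node 3,
# relation 20 refers to a missing way 11.
SAMPLE = b"""<osm version="0.5">
  <node id="1" lat="49.0" lon="8.4"/>
  <node id="2" lat="49.1" lon="8.5"/>
  <way id="10"><nd ref="1"/><nd ref="2"/><nd ref="3"/></way>
  <relation id="20">
    <member type="way" ref="10" role=""/>
    <member type="way" ref="11" role=""/>
  </relation>
</osm>"""

dangling = find_dangling_refs(io.BytesIO(SAMPLE))
for referrer_type, referrer_id, target_type, target_id in dangling:
    print(f"{referrer_type} {referrer_id} refers to missing {target_type} {target_id}")
```

Collecting all references first and comparing at the end costs memory,
but it avoids false alarms when a relation refers to another relation
that appears later in the file.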

I think there's a catch, however: the planet file generation process
may itself introduce artificial inconsistencies. If someone uploads a
new way plus its nodes halfway through the export, the way can end up
in the file while the nodes don't. So if you do a check based on the
planet file, you'll have to take that into account, either by ignoring
problems with objects created in the hours before the file was
completed, or by cross-checking any problems found with "real" data
from the API.
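
The first of those two options could look roughly like the sketch
below. The names split_by_age and parse_ts are hypothetical, and the
six-hour window is an arbitrary assumption, since I don't know how long
an export really takes. It also uses the timestamp stored on the
referring object, which records the last change and not necessarily
the creation time.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def split_by_age(problems, export_finished, window_hours=6):
    """Split (object_id, timestamp) pairs into (trusted, needs_recheck).

    Problems on objects touched within window_hours before the export
    finished may be export artefacts and should be re-checked elsewhere.
    """
    cutoff = parse_ts(export_finished) - timedelta(hours=window_hours)
    trusted, needs_recheck = [], []
    for obj_id, ts in problems:
        if parse_ts(ts) >= cutoff:
            needs_recheck.append((obj_id, ts))
        else:
            trusted.append((obj_id, ts))
    return trusted, needs_recheck


# Hypothetical findings: ids of ways with dangling node references.
problems = [
    (10, "2008-02-10T12:00:00Z"),  # well before the export
    (11, "2008-02-13T01:30:00Z"),  # inside the assumed window
]
trusted, needs_recheck = split_by_age(problems, "2008-02-13T03:00:00Z")
```

Anything that lands in needs_recheck is a candidate for the second
option, i.e. looking the object up again in the live data before
reporting it as broken.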


Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00.09' E008°23.33'
