[OSM-dev] CHANGE DETECTION IN OSM GEOMETRY

Sandor Seres sandors39 at gmail.com
Fri Nov 8 11:00:47 UTC 2013


I am not sure I properly understand whether and how change detection
happens in OSM source data and related applications (mapping systems such as
Slippymap layers, tiling services and so on). I have gone through many
related articles in the OSM wiki, but they helped only moderately. As I
understand it, editors from around the world upload their changes to the OSM
database to the best of their ability. How to use these large-volume,
frequent changes is up to the individual mapping applications or data
service providers. To help application users update their databases
efficiently, OSM provides regular downloadable dumps and change packages
called "diffs". Diffs contain the edits made between the "last time" and
"now", for different time spans.
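
To make the notion of a diff concrete, here is a minimal Python sketch that tallies the edits in an osmChange document by action and object type. The inline sample and the helper name summarize_changes are made up for illustration; a real diff would be read from a downloaded file instead.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, hand-written osmChange sample (not real OSM data).
SAMPLE = """<osmChange version="0.6">
  <create>
    <node id="1" version="1" lat="59.0" lon="10.0"/>
  </create>
  <modify>
    <way id="7" version="3">
      <nd ref="1"/>
      <nd ref="2"/>
      <tag k="natural" v="coastline"/>
    </way>
  </modify>
  <delete>
    <node id="5" version="2" lat="59.1" lon="10.1"/>
  </delete>
</osmChange>"""


def summarize_changes(xml_text):
    """Count edits per (action, object type), e.g. ("modify", "way")."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for action in root:          # <create>, <modify> or <delete> blocks
        for element in action:   # <node>, <way> or <relation> objects
            counts[(action.tag, element.tag)] += 1
    return dict(counts)


print(summarize_changes(SAMPLE))
```

Note that the modified way is carried as a whole object and lists only node references, not coordinates, so its geometry cannot be rebuilt from the diff alone.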

If my understanding of this semantics-based (time stamp, time-ID) change
detection is close to reality, I would point out several drawbacks of such
detection:

-Diffs are aggregated edits from "last time" until "now". To detect changes,
a mapping system has to compare the objects in the diff with the "last time"
objects. As a rule, this is possible only after extensive transformations of
the objects in the diff, and these transformations may require the whole
current data set.

-Edits in a diff relate to whole objects (or features, as you probably
call them) even when only small portions of the objects have changed. Some
editors make similar updates by inserting objects into other object layers
(for example, small virtual lakes to refine a coastline section).

-Geometry and topology errors are never detected (duplicated and nearly
duplicated polylines, border polygons and nodes, missing container/outer
border polygons, missing hole/inner border polygons, overlapping or partly
overlapping objects from different classes such as lakes/rivers,
lakes/Planet-sea and so on). These errors accumulate over time, and there is
a real risk that they will remain permanently. There is already a huge
number of them, and they are present in any publicly available OSM-based
mapping system today.

Geometry-based change detection could be much more robust and could
increase the efficiency and reliability of database updates (by applying
only the necessary geometry changes). One possible option is outlined
briefly below.

Assume we have a hybrid data set (format) for any data layer from
"last time" and from "now" (old and new data). The hybrid format integrates
the tiled vector data with a bi-tonal, highly compressed (tile) position
image. Any "black" pixel refers directly to the corresponding tile. We keep
only non-trivial tiles (so, for area object layers, tiles from the interiors
are ignored, or pre-defined by the format). The projection used is Mercator;
the tiles are square with a 2000 m edge. The hybrid format is considerably
smaller than the input (non-tiled) format and is created quickly in a
multi-tiling procedure. For example, for the Planet-land data layer the
hybrid format is created in several minutes on a laptop. In what follows we
use the Planet-land data layer as the example.
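
As an illustration of the tile addressing, the following Python sketch maps a longitude/latitude pair to the index of a 2000 m square tile. It assumes a spherical Mercator projection with radius 6378137 m and an origin at (0, 0); the post only says "Mercator", so the radius, the origin and the name tile_index are assumptions made here, not the format's actual definition.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumed sphere radius (not stated in the post)
TILE_SIZE_M = 2000.0        # tile edge length given in the post


def tile_index(lon_deg, lat_deg, tile_size=TILE_SIZE_M):
    """Return the (column, row) of the tile containing the given point."""
    # Spherical Mercator: x = R * lon, y = R * asinh(tan(lat)), in metres.
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.asinh(math.tan(math.radians(lat_deg)))
    # Floor division so that tiles west/south of the origin get negative indices.
    return math.floor(x / tile_size), math.floor(y / tile_size)


print(tile_index(1.0, 0.0))
```

Each (column, row) pair then corresponds to one pixel of the position image, which is "black" only if the tile is non-trivial.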

Comparing the old and new hybrid data sets we can, practically in no time,
detect the geometry changes and obtain the data needed to handle:

-Tiles exclusively present in the "last time"/old data set. These are tiles
to be deleted from the database. The access indices are provided either
on-the-fly or packed into a list;

-Tiles exclusively present in the "now"/new data set. These are tiles for
insertion into the database. The insertion could be done, again, on-the-fly
or from a package/list of new tiles; and

-Tiles present at the same position in both the old and new data sets but
with different contents. These are tiles to be replaced in the database. The
access indices are provided directly in pairs or in delete/insert
update-list format.
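
The three cases above can be sketched in Python as plain set operations. Here each data set is modelled as a dictionary from a (column, row) tile index to the tile's encoded vector content; that dictionary stands in for the position image plus tile store, and the name diff_tiles is hypothetical, so this is only a sketch of the idea, not the actual hybrid format.

```python
def diff_tiles(old, new):
    """Classify tiles into those to delete, insert and replace.

    old and new map a (column, row) tile index to the tile content (bytes).
    """
    old_keys, new_keys = set(old), set(new)
    deleted = sorted(old_keys - new_keys)    # only in the old data set
    inserted = sorted(new_keys - old_keys)   # only in the new data set
    replaced = sorted(k for k in old_keys & new_keys if old[k] != new[k])
    return deleted, inserted, replaced


# Toy data: four tile positions with made-up contents.
old_tiles = {(0, 0): b"A", (0, 1): b"B", (1, 1): b"C"}
new_tiles = {(0, 0): b"A", (0, 1): b"B2", (2, 2): b"D"}

deleted, inserted, replaced = diff_tiles(old_tiles, new_tiles)
print(deleted, inserted, replaced)
```

Unchanged tiles such as (0, 0) fall into none of the three lists, so only the necessary geometry changes reach the database.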

It is worth mentioning that, besides geometry change detection, the model
also provides an excellent means of detecting errors by visual inspection.
When the new and old tiled data for an object layer are overlaid with
suitably contrasting colours, a quick visual inspection immediately shows
suspicious cases. For example, for the Planet-land data layer we detected,
within 20 minutes, over 100 errors (systematic, probably permanent) that are
present in any OSM-based mapping system.

The described (geometry) change detection model is just one option, but one
taken from working practice. It may provide some hints to developers of
OSM-based mapping systems, especially if their systems use databases. If you
are interested, more details, examples and illustrations can be found in
this white paper:

https://drive.google.com/file/d/0B6qGm3k2qWHqSnotbW1QUThBa1E/edit?usp=sharing

 

Regards, Sandor 

 
