[Tagging] Data redundancy with "ref" tag on ways vs relations
"Petr Morávek [Xificurk]"
xificurk at gmail.com
Mon Jul 30 20:57:58 BST 2012
Tobias Knerr wrote:
> If two instances are created at least somewhat independently*
This is a really bold assumption. I have a hard time imagining a
real-life scenario where this is true.
On the other hand, I can imagine scenarios where the cross-check will
fail simply because someone who edited the way forgot to edit the relation
as well, or vice versa.
> However, at this point we can begin to use automated error checking. The
> idea is that errors that can be found automatically are much more
> acceptable than those that cannot.
> With only one instance of the data, none of the errors can be found
You can spot a lot of errors just by doing a simple analysis of the
route graph: Are the individual segments continuous? Is the resulting route
a simple linear feature? ... Yes, it's not 100% accurate, but neither is
the alternative (data duplication + cross-checks).
This way you can catch most of the important errors without having to
rely on duplicated data.
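
To make the two questions above concrete, here is a rough Python sketch of
such a check. It is only an illustration under assumptions of mine: the
route is given as a dict mapping way ids to ordered lists of node ids, and
the function name analyze_route and the result keys are made up for this
example, not taken from any existing QA tool.

```python
from collections import defaultdict, deque


def analyze_route(ways):
    """Check a route given as {way_id: [node_id, ...]} (hypothetical format).

    'continuous' means all segments form one connected graph.
    'linear' means the graph is a single unbranched line with two ends.
    """
    # Build an undirected graph from consecutive node pairs of every way.
    adjacency = defaultdict(set)
    for nodes in ways.values():
        for a, b in zip(nodes, nodes[1:]):
            if a != b:
                adjacency[a].add(b)
                adjacency[b].add(a)

    if not adjacency:
        return {"continuous": False, "linear": False}

    # Breadth-first search from an arbitrary node to test connectivity.
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    continuous = len(seen) == len(adjacency)

    # A simple line has exactly two end nodes and no node with 3+ neighbours.
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    endpoints = sum(1 for d in degrees if d == 1)
    linear = continuous and endpoints == 2 and all(d <= 2 for d in degrees)

    return {"continuous": continuous, "linear": linear}


# Two ways sharing node 12 form one unbranched line.
print(analyze_route({1: [10, 11, 12], 2: [12, 13]}))
# {'continuous': True, 'linear': True}
```

Note that a closed loop (e.g. a ring road) has no end nodes and is reported
as not linear by this sketch; a real tool would have to decide how to treat
loops, roundabouts and dual carriageways.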
I think it's better to spend some time developing more sophisticated
QA tools than to waste it on data duplication.
Actually, we discussed this issue on talk-cz (Czech Republic)
recently. Someone wrote a simple analysis tool for finding "holes" in our
road network left by the redaction bot. The tool simply collected all
ways with e.g. highway=primary+ref=## and ran some checks on them.
Consequently, the question was raised why we add the ref tag to every
single way, and whether it would be a good idea to move it to some parent
relation. AFAIK, we don't use (m)any route relations in our road network
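
I don't have the talk-cz tool at hand, so the following is not its code,
just a minimal Python sketch of the general idea: group ways by their ref
tag and report refs whose ways fall apart into several disconnected pieces.
The input format (a list of dicts with "tags" and "nodes") and the names
count_pieces and find_holes are assumptions made for this example.

```python
from collections import defaultdict


def count_pieces(node_lists):
    """Count connected pieces among ways, using a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for nodes in node_lists:
        if not nodes:
            continue
        root = find(nodes[0])
        # Merge every node of the way into the piece of its first node.
        for node in nodes[1:]:
            parent[find(node)] = root

    return len({find(node) for node in list(parent)})


def find_holes(ways, highway="primary"):
    """Return {ref: pieces} for every ref that is split into 2+ pieces.

    `ways` is a list of {"tags": {...}, "nodes": [...]} dicts
    (a hypothetical format chosen for this sketch).
    """
    by_ref = defaultdict(list)
    for way in ways:
        tags = way["tags"]
        if tags.get("highway") == highway and "ref" in tags:
            # A way may carry several refs separated by semicolons.
            for ref in tags["ref"].split(";"):
                ref = ref.strip()
                if ref:
                    by_ref[ref].append(way["nodes"])

    holes = {}
    for ref, node_lists in by_ref.items():
        pieces = count_pieces(node_lists)
        if pieces > 1:
            holes[ref] = pieces
    return holes
```

A ref reported here is only a candidate for a hole: a road can legitimately
be discontinuous, for example where it changes classification for a
stretch, so the output still needs a human look.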