[Tagging] Data redundancy with "ref" tag on ways vs relations

Paweł Paprota ppawel at fastmail.fm
Mon Jul 30 18:32:57 BST 2012


> But what leads you to the assumption that the data gets better when we 
> agree to only use ref on relations or only use ref on ways?

Well, my logic is simple - less duplication = less data to maintain =
more time mappers can spend on checking the quality of the data.

If I understand your points correctly and follow that logic, introducing
"ref" on nodes would make users triple-check the data and give consuming
software a third source of ref data, which would be even better.

I think that data duplication is fundamentally wrong as it leads to
counterproductive work like fixing "ref" tags along the 1000 km motorway
which some users are doing right now based on OSMonitor reports...



> While fixing bots aren't a good approach here - you don't know for sure 
> if the relation or the way are correct - this is the way most of the 
> current QA tools work now: use heuristics or validity checks to guess 
> where errors might be, and most of these tools are welcome and (some) 
> mappers sometimes look into it to hunt down data errors.

This is exactly what OSMonitor is NOT trying to do. See the introduction
of https://wiki.openstreetmap.org/wiki/OSMonitor, which explains what
I mean. The report does not just list relations from OSM - it lists real
roads and, based on that, finds the relation, verifies the ways etc. So if
the road is red, that means the OSM data is wrong (or OSMonitor has a bug,
but of course it doesn't have bugs ;-).
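
To make the idea concrete, here is a rough Python sketch of that kind of
check. It is not OSMonitor's actual code: the Way and Relation classes, the
check_road function and the "ok" status are made up for illustration ("red"
just mirrors the report's wording). It starts from the ref of a real road,
looks for a route relation with that ref, then checks that every member way
carries the same ref.

```python
from dataclasses import dataclass, field


@dataclass
class Way:
    # Hypothetical minimal stand-in for an OSM way.
    id: int
    tags: dict = field(default_factory=dict)


@dataclass
class Relation:
    # Hypothetical minimal stand-in for an OSM relation; members are Way objects.
    id: int
    tags: dict
    members: list


def way_refs(way):
    """Split a way's "ref" tag on ';' because one way can carry several refs."""
    raw = way.tags.get("ref", "")
    return {part.strip() for part in raw.split(";") if part.strip()}


def check_road(real_ref, relations):
    """Return (status, problems) for one road from the list of real roads."""
    matching = [r for r in relations
                if r.tags.get("type") == "route" and r.tags.get("ref") == real_ref]
    if not matching:
        return "red", [f"no route relation with ref={real_ref}"]
    problems = []
    for rel in matching:
        for way in rel.members:
            if real_ref not in way_refs(way):
                problems.append(
                    f"way {way.id}: ref={way.tags.get('ref')!r} does not include {real_ref}")
    return ("red" if problems else "ok"), problems


# Example: the second way has a differently written ref, so the road is flagged.
relation = Relation(1, {"type": "route", "route": "road", "ref": "A1"},
                    [Way(10, {"highway": "motorway", "ref": "A1"}),
                     Way(11, {"highway": "motorway", "ref": "A 1"})])
status, problems = check_road("A1", [relation])
print(status, problems)
```

The point of starting from the real road rather than from whatever is in OSM
is that a missing relation also shows up as an error, instead of silently
not being listed.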

Paweł

On Mon, Jul 30, 2012, at 19:12, Peter Wendorff wrote:
> On 30.07.2012 18:58, Paweł Paprota wrote:
> > Hi Peter,
> >
> > I understand what you're saying about ease of use but at the same time I
> > am very concerned about the quality of data - it is clear from reports
> > that there are just so many errors that the ref data is virtually
> > useless for navigation or location purposes.

> 
> I think this would lead to a situation where the error count doesn't 
> decrease, but the remaining errors aren't detectable any more.
> 
> Having refs only on relations means for a data consumer: I have to use 
> this data and I have no idea if it's correct - I have to assume it is to 
> use it.
> Same for refs only on ways.
> 
> refs on both means: I am free to use either one, which is no worse 
> than the two options above. On top of that, I am able to check 
> whether the two taggings conflict, and if they do, I may e.g. ask my 
> users which one is correct. As OSM is free for everyone who agrees to 
> the contributor terms and license, it's very welcome that errors are 
> fixed or reported by these consumers or their users.
> > I feel like there is no clear contract between the data and the
> > consuming software - some people use "ref" on ways, some people add
> > relations (this is preferred now as I see from remapping efforts). I see
> > two ways to "fix" it:
> >
> > * Invest time in QA - like reporting, auto fixing bots etc. so that the
> > relations and refs on ways are synced.

> > * Choose one way (relations are the clear "winner" here), invest time
> > into making consuming software support this way and clearly encourage it.
> -1 as described above: we lose the possibility to check for clear bugs.
> 
> regards
> Peter
