[OSM-talk] stop deleting abandoned railroads

Colin Smale colin.smale at xs4all.nl
Tue Aug 18 06:45:59 UTC 2015


 

On 2015-08-18 02:13, Warin wrote: 

> On 17/08/2015 11:13 PM, Colin Smale wrote: 
> 
>> ...which IMHO is part of the bigger picture of data quality. Quality is not the same as perfection. It is about agreeing things, complying with what has been agreed, the ability to measure the compliance objectively and feedback to help improve the compliance.
> 
> ISO 9000 is a standard for quality .. it means if you produce something .. you will continue to produce that something consistently .. rubbish or not.

Actually it is a standard for Quality Management Systems. It does not
tell you what attributes your product should have - that's between you
and your consumer/customer. OSM doesn't really have any way of assessing
its product against desired attributes. How do you think OSM's product
should be measured for these purposes? At the moment it is very
subjective - "good" is anything which is not considered "bad", and "bad"
means shouted down by a few people on a mailing list and/or vetoed by
the DWG in a sort of Star Chamber process. 

> 'Agreed'? By whom? OSM can have new tags introduced by anyone. The reality of this is that tags that get used frequently by a number of mappers get 'recognised'.

Agreed between producer and consumer. Our definition of quality will not
include a limitation to ONLY use certain tags, implying that it is the
consumer's responsibility to ignore arbitrary tags. What are our
consumers' expectations? What (apart from product price) will drive
their decision to use OSM instead of other sources? 

> Tags that get 'approved' by the tagging group get the status=approved thing, those rejected get the status=rejected .. but even the rejected tags get used, some even advocate their use. 
> One can take the attitude that at least these tags have been reviewed by some, compared to tags that are simply added by one person without review. 
> 
> Compliance .. with what? The wiki documented tags? Those can be added by anyone. As there is no scheme/philosophy for OSM .. then you have nothing to comply to that cannot be changed so easily that it is not worth the effort.

Compliance with the agreed "specifications." Once again, we don't have a
good definition of "quality" for OSM data, so we cannot use that to
judge whether data is "good" or "bad", or, put another way, "compliant"
or "non-compliant". 

So what dimensions could we apply to OSM data to assess its quality? I
am just throwing some ideas into the mix here; this is not my "answer".
In all cases please imagine the words "to what extent" at the start of
the sentence. 

* Completeness 

Is the data complete, given its intended scope? For example, do we
have ALL the train stations in the UK? 

* Correctness 

Are there any typos in the tagging? Is a train station not tagged as a
tram stop? 

Is the use of those tags which are documented, in line with the
documentation? 

* Consistency 

Is the tagging consistent, across its intended applicable domain? (I
intend to suggest that it is probably impossible to get tagging
consistent across the whole world, but within a country for example it
should most definitely be achievable) 

* Timeliness 

Is the data still valid today? Or to make it "SMART", how long ago was
the data reviewed? Different things will need different standards here -
some things are obviously more volatile than others. 

* Verifiability 

Did the data come from a suitably licenced source? 

Is the data verifiable by an independent member of the public without
any legal privilege? 

* Consumability 

Is the data represented and made available in a way which facilitates
its use? For example, dates in arbitrary local formats would not be
compliant here. We might not be too happy with tags using non-Latin
characters. The use of XML is good, but it's a shame we don't have even
a basic XSD yet (I am working on this though) 
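
To make the point about date formats concrete, here is a minimal
sketch of a consumability check. It assumes that "compliant" means a
full calendar date written as YYYY-MM-DD (ISO 8601); that choice, and
the function name is_iso_date, are illustrative assumptions rather
than an agreed OSM rule.

```python
from datetime import date


def is_iso_date(value: str) -> bool:
    """Return True if value is a full calendar date written as YYYY-MM-DD.

    Assumption: YYYY-MM-DD is the agreed format. Anything else
    (local formats, free text, partial dates) counts as non-compliant.
    """
    try:
        parsed = date.fromisoformat(value)
    except ValueError:
        return False
    # Round-trip so that other ISO variants accepted by newer Python
    # versions (e.g. "20150818") are still rejected.
    return parsed.isoformat() == value


# Hypothetical tag values as they might be found in the data.
values = ["2015-08-18", "18/08/2015", "August 2015"]
compliant = [v for v in values if is_iso_date(v)]
print(f"{len(compliant)} of {len(values)} values are compliant")
```

The share of compliant values would then be one possible score for
this dimension.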

All this might tell us how the data scores, but it doesn't tell us what
we should consider "good enough". In some cases we can expect to get
close to 100% (e.g. train stations in the UK), but all sorts of factors
will keep the score below 100% in practice. For example, when a new
station opens, it MAY take a long time to find its way into OSM; in the
meantime we are down to 99.9%. In other cases, we might be ecstatic if
15% of the data was entered/reviewed in the last 5 years. 
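
As a rough illustration of how two of these scores could be computed,
here is a sketch under stated assumptions: we have a last-edit
timestamp per element, and for completeness an independent reference
list of identifiers to compare against. The function names, the
5-year window expressed as 5 * 365 days, and the sample data are all
hypothetical.

```python
from datetime import datetime, timedelta, timezone


def timeliness_share(timestamps, now, max_age_days=5 * 365):
    """Fraction of elements whose last edit is at most max_age_days old."""
    if not timestamps:
        return 0.0
    cutoff = now - timedelta(days=max_age_days)
    recent = sum(1 for ts in timestamps if ts >= cutoff)
    return recent / len(timestamps)


def completeness_share(mapped_ids, reference_ids):
    """Fraction of the reference list that is present in the mapped data."""
    reference = set(reference_ids)
    if not reference:
        return 0.0
    return len(reference & set(mapped_ids)) / len(reference)


# Made-up sample data, for illustration only.
now = datetime(2015, 8, 18, tzinfo=timezone.utc)
edits = [
    datetime(2014, 1, 1, tzinfo=timezone.utc),
    datetime(2012, 1, 1, tzinfo=timezone.utc),
    datetime(2009, 1, 1, tzinfo=timezone.utc),
    datetime(2005, 1, 1, tzinfo=timezone.utc),
]
print(f"timeliness: {timeliness_share(edits, now):.0%}")
print(f"completeness: {completeness_share(['a', 'b', 'x'], ['a', 'b', 'c', 'd']):.0%}")
```

Neither number says what is "good enough"; that threshold still has
to be agreed separately for each kind of data.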

--colin 

 