[OSM-talk] 'Allowed data'

Lester Caine lester at lsces.co.uk
Thu Dec 5 14:00:43 UTC 2013


As a result of some miscommunication my job today was cut short, so I'm back 
home early ... this is a summary of my thoughts while driving home.

The trip home was fun due to a lack of the right data. Had I known that the A429 
was closed I would have made a different decision early on, but watching how 
OSMAnd on the phone and my tomtom handled the situation was interesting. The 
data relating to road types used for routing needs tidying up in a few places, 
with perhaps an element of local knowledge adding to the simple 
'highway=tertiary' blanket 'less suitable' classification in the routing 
process, but that is just another area for development.

While the project name may include the word 'map' it is now well established 
that currently it is 'data' which is the main target of the project. I've 
deliberately left out the word 'the' there which is a subtlety that needs 
explaining first. OSM is a rapidly growing archive of data of several types and 
a lot of information is available if viewed correctly. 'The Data' is what people 
allow to be viewed via the main API rather than via the history, and this is 
where I personally have a problem: the fact that a road existed from time A to 
time B may well be contained in the changelog, but it is not so easily 
accessible. Making that history 'The Data' provided by OHM is in many cases 
simply not the right approach, since the data is already contained in the main 
data repository and there are no plans to 'delete' the changelog?
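
As a rough illustration of pulling 'time A to time B' out of the changelog, here
is a minimal Python sketch. The record layout (version, timestamp, visible) is a
simplified assumption and the sample dates are invented. Note that it only
recovers when an object was added to and deleted from the database, which is not
the same thing as when the road existed on the ground.

```python
from datetime import datetime
from typing import Optional

# Hypothetical, simplified history records: (version, timestamp, visible).
# A real history extract carries far more detail than this.
History = list[tuple[int, datetime, bool]]


def edit_lifetime(history: History) -> tuple[Optional[datetime], Optional[datetime]]:
    """Return (first_seen, deleted_at) for one object in the edit history.

    deleted_at is None when the latest version is still visible.
    """
    ordered = sorted(history, key=lambda v: v[0])  # order by version number
    if not ordered:
        return None, None
    first_seen = ordered[0][1]
    _, last_time, visible = ordered[-1]
    return first_seen, (None if visible else last_time)


# Invented sample data for one road.
road = [
    (1, datetime(2009, 3, 1), True),
    (2, datetime(2011, 6, 15), True),
    (3, datetime(2013, 11, 20), False),  # deleted in this version
]
first_seen, deleted_at = edit_lifetime(road)
```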

I have a growing archive of data providing the 'start_date' for many of the 
roads in the areas I'm interested in, and once time permits I will upload them, 
but while the 'added' date is always automatically logged, there is little 
incentive to add a 'start_date' even when new developments are being added to 
the data. When adding historic data an exact date may not be available, but 
checking back on some of the growing number of historic overlays does allow a 
'before' date to be added. So I would like to request that 'start_date' is 
automatically populated with, at the very least, the current date, but with an 
option to update it based on what is being traced from?
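
The request above could be sketched roughly as follows in Python. This is only
an illustration, not how any editor or the API behaves: the function name, the
'source_date' parameter and the choice of an ISO date string are all
assumptions.

```python
from datetime import date
from typing import Optional


def with_start_date(tags: dict[str, str],
                    source_date: Optional[date] = None,
                    today: Optional[date] = None) -> dict[str, str]:
    """Return a copy of tags with 'start_date' filled in if it is missing.

    source_date: date of the overlay or imagery being traced from, if known.
    today: injectable for testing; defaults to the current date.
    """
    result = dict(tags)
    if "start_date" in result:
        return result  # never overwrite a value the mapper supplied
    chosen = source_date or today or date.today()
    result["start_date"] = chosen.isoformat()
    return result
```

A date filled in this way only says the object existed by then, so it would want
to be treated as a 'before' date until someone supplies a better one.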

Moving on to data that is less easily 'verified on the ground'. The one thing 
that the data is not is 'relational', but with the growing volume is it not time 
to re-address this area? The current debate is on adding addresses and other 
'spam' to the data. If this adds information like house numbers and postcodes to 
the data, then actually I can live with the random data also added. However, 
I've only added a few house numbers locally here since creating the tags for 
every one is time consuming, and I don't see any advantage in having 'Smallbrook 
Road' added some fifty times in the data! All I need is a tag referencing the road 
(or part of it where the postcode changes) and the volume of data is reduced. 
'Smallbrook Road' will reference all of the higher level links needed. As a 
simple extension to this we can also solve a problem that the routing software 
has, by adding an 'abutting' tag where the 'postal' address of a property differs 
from the best route for accessing it. In the UK it's 
not uncommon to see 'POSTCODE for satnav' after an address ;)
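
To make the referencing idea concrete, here is a small Python sketch. The class
and field names are hypothetical, the postcodes are placeholders and 'Back Lane'
is invented for the example; 'abutting_id' stands in for the 'abutting' tag
suggested above and is not an existing tag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Street:
    name: str
    postcode: str


@dataclass(frozen=True)
class House:
    number: str
    street_id: int                      # reference instead of a repeated name
    abutting_id: Optional[int] = None   # hypothetical 'abutting' reference


# The street name and postcode are stored once, not once per house.
streets = {
    1: Street("Smallbrook Road", "AB1 2CD"),  # placeholder postcode
    2: Street("Back Lane", "AB1 2CE"),        # invented street
}

houses = [House("12", 1), House("14", 1, abutting_id=2)]


def postal_address(house: House) -> str:
    """Build the postal address by following the street reference."""
    street = streets[house.street_id]
    return f"{house.number} {street.name}, {street.postcode}"


def routing_street(house: House) -> str:
    """Street a router should aim for: the abutting one if set, else the postal one."""
    target = house.abutting_id if house.abutting_id is not None else house.street_id
    return streets[target].name
```

Here number 14 still has a 'Smallbrook Road' postal address, but a router would
be sent to 'Back Lane'.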

Moving the other way in relation to information in addition to the house number 
or name, adding things like phone number and website has become accepted, and 
the one good thing with the 'new' front end is that they are made available. 
Perhaps not in a style that is usable as a replacement for google, but at least 
it shows the principle. Since the link is active one can follow on to the site 
which is something we did not have before. However I think I am with others when 
I say that listing all the websites for the PO boxes at a post office located on 
the map is a step too far. It *IS* however a point that if that physical 
location had a website which listed its customers, then one could follow 
through and see that secondary data? The physical location is a post office - 
nothing more - with a physical address.

There was a suggestion relating to the bitcoin 'spam' that the additional data 
should be handled elsewhere, and certainly a database of 'bitcoin' shops could 
quite easily use a reference to OSM in its own database. This would just be an 
alternative to a 'directory of businesses' provided by the 'post office' when 
working the other way. What I think I am getting to is the 'payment' tag! Should 
that have any place in our data? Yes, it makes searching for 'bitcoin' or 'visa' 
shops easier, but if one has a link to the business, then following that will 
provide the current up-to-date data and we do not need to clog up the changelog 
with all of that traffic?

We need a roadmap of what a 'complete' set of data looks like, and I can see 
separate RELATIONAL databases run by others providing at least part of that 
data? Even the 'boundaries' problem could be solved by providing a database of 
physical objects selected from 'The Data' and augmented with virtual lines where 
no physical one exists. I am thinking here that the ways required for those types 
of geometry can be cached easily in a separate database and either updated as 
the underlying ways change, or actually more usefully, where a boundary changes 
in the future, the historic versions are maintained!
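
A separate boundary store that keeps its historic versions might look something
like this Python sketch. The class, the way IDs and the dates are all invented
for illustration, and a real store would also need to track changes to the
underlying ways themselves.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import date


@dataclass
class BoundaryHistory:
    """Keeps every version of a boundary's member ways, keyed by start date."""
    _dates: list[date] = field(default_factory=list)
    _members: list[tuple[int, ...]] = field(default_factory=list)

    def record(self, valid_from: date, way_ids: tuple[int, ...]) -> None:
        # Insert in date order so lookups can use binary search.
        i = bisect_right(self._dates, valid_from)
        self._dates.insert(i, valid_from)
        self._members.insert(i, way_ids)

    def at(self, when: date) -> tuple[int, ...]:
        """Return the member ways that made up the boundary on a given date."""
        i = bisect_right(self._dates, when)
        if i == 0:
            raise LookupError("no boundary recorded for that date")
        return self._members[i - 1]


# Invented example: one way is swapped out when the boundary changes.
parish = BoundaryHistory()
parish.record(date(1974, 4, 1), (101, 102, 103))
parish.record(date(2009, 4, 1), (101, 102, 104))
old_ways = parish.at(date(2000, 1, 1))
new_ways = parish.at(date(2013, 12, 5))
```

Nothing is overwritten when the boundary changes, so the older version remains
available for any date before the change.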

I'd like to get back to adding to 'The Data' rather than fighting the 
infrastructure to use it. 'The Map' is an alternative to what came before, but 
it only addresses a small area that was less broken than the bigger 
infrastructure, and so now we just need to refocus from a different angle? One 
can use 'The Data' without 'The Map', but accessing the information on how to do 
so is still an area that needs fixing.

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk


