[OSM-talk] Importing third party data

Lester Caine lester at lsces.co.uk
Wed Sep 19 07:36:03 BST 2012


Lucas Nussbaum wrote:
> Alternatively, if this was software development, what should probably be
> done is:
> 1. commit the raw conversion for the vectorized cadastre, before the
>     cleanup
> 2. clean up and upload modified buildings after the cleanup
> 3. add roads, etc. and upload

With the growing volume of data that is becoming available under licences that 
allow us to use them as sources, it IS important that we have a well documented 
process for how this data can be used and that its importation is properly 
managed. A lot of 'data' is only available as raster images, such as the 
satellite imagery or maps which can be used as trace sources in the various 
editors. Some contributors do an excellent job of adding these sources as 
properly scaled and geo-referenced layers, and for many of us the increasing 
availability of historic maps fully geo-referenced is doing away with the need 
to 'import' information into the main database - which is a pity.

Other data sets are available either as text data or as full vector data. The 
text data is currently handled on something of a piecemeal basis. Something like 
'a list of places of worship' is relatively easy to cross reference, but without 
a reliable 'url' for the object ON OSM we can't take as much advantage of that 
data as would be nice.

Vector data is a different matter, and it is the handling of this that we are 
currently 'disagreeing' about? It is the first line above that is the whole 
problem here. I remember the discussion on the Canadian data but I don't 
remember if the data set is now being properly referenced? I have access to a 
layer of data for the UK that we are currently trying to make available as an 
import source. Some people will not like it being imported, but it sounds like 
it is the same layer of data as is being added in France. The National Land and 
Property Gazetteer is a complete index of all of the land in the UK and includes 
a complete list of all identified roads. In theory it has vector data relating 
to each object in the data set, but THAT layer is of vastly different quality 
depending on the local authority who are required to gather it. Some do not yet 
add this data, and OSM has a unique opportunity to fill that hole! If only we 
were allowed.

The current holdup is licensing, but hopefully it will become available and that 
is being actively worked on. The PROBLEM is managing the integration of that 
data and, more importantly, its ongoing updates. However the system is DESIGNED to 
be continually updated, and updates are happening 'internally' on a daily basis. 
So update sets are sent out ( and I get them for a number of the local councils 
that I provide services to ) and we apply them to the local copies of the data. 
Linking that process into a 'mass import' into OSM will be easy, but then 
providing the back path of 'updates' to NLPG will be more difficult but 
something which is of mutual benefit. Point 2 above 'clean up and upload 
modified buildings after the cleanup' should provide OFFLINE a means of 
identifying the data that has been updated from the previous upload and ideally 
'update' the existing object in the database. Then we can spend our time ADDING 
data to the database rather than throwing it away every time?
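
As a rough illustration of that OFFLINE step, here is a minimal sketch of comparing 
the previous upload with a new update set. It assumes each source record carries 
a stable identifier that survives between releases; the snapshot layout and the 
names diff_snapshots and Delta are hypothetical, not part of NLPG or any OSM tool.

```python
from typing import Dict, List, NamedTuple

# One source record: attribute name -> value (hypothetical layout).
Record = Dict[str, str]


class Delta(NamedTuple):
    added: List[str]    # ids present only in the new snapshot
    changed: List[str]  # ids present in both, with different attributes
    removed: List[str]  # ids present only in the previous snapshot


def diff_snapshots(previous: Dict[str, Record],
                   current: Dict[str, Record]) -> Delta:
    """Compare two snapshots keyed by an assumed stable source identifier."""
    added = sorted(k for k in current if k not in previous)
    removed = sorted(k for k in previous if k not in current)
    changed = sorted(
        k for k in current
        if k in previous and current[k] != previous[k]
    )
    return Delta(added, changed, removed)


if __name__ == "__main__":
    previous = {"1001": {"name": "High Street"}, "1002": {"name": "Mill Lane"}}
    current = {"1001": {"name": "High Street"}, "1002": {"name": "Mill Road"},
               "1003": {"name": "New Close"}}
    print(diff_snapshots(previous, current))
```

Only the ids in 'changed' would then need the existing object updated, and 
anything untouched in the source is left alone along with whatever mappers have 
added to it.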

It is the 'update' process which we are currently disagreeing on with the French 
data, and the same discussion happened over the Canadian. Simply wiping existing 
data and loading a new copy is wrong! Now it may be that the data source does 
not provide a proper basis to provide managed updates, and that is where the 
LOCAL groups need to liaise with the sources of the data and help develop a 
process that allows managed updates. Failing that, then we come back to the 
'commit the raw conversion for the vectorized cadastre, before the cleanup' 
which it WOULD be nice to maintain as an overlay layer, but perhaps it simply 
needs to be a vector overlay as with other map sources.

Point 3 above, 'add roads, etc. and upload', is simply wrong ... and this is the 
MAIN point about using an account that is identified as an 'import account' ... 
all of the data added or deleted in that import should relate to the data source!
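
One way to check that rule before uploading is sketched below. It assumes each 
change in the import is a simple record holding an object id and that object's 
tags, and that imported objects carry a 'source' tag naming the data set; the 
record layout and the function name unrelated_to_source are hypothetical.

```python
from typing import Dict, Iterable, List


def unrelated_to_source(changes: Iterable[Dict],
                        expected_source: str) -> List[str]:
    """Return ids of changed objects whose 'source' tag is not the import source.

    Each change is assumed to look like {"id": "...", "tags": {...}}.
    """
    offenders = []
    for change in changes:
        tags = change.get("tags", {})
        if tags.get("source") != expected_source:
            offenders.append(change["id"])
    return offenders


if __name__ == "__main__":
    changes = [
        {"id": "way/1", "tags": {"building": "yes", "source": "cadastre"}},
        {"id": "way/2", "tags": {"highway": "residential"}},
    ]
    # Lists the objects that do not belong in the import account's upload.
    print(unrelated_to_source(changes, "cadastre"))
```

Anything the check flags would belong in a normal user changeset rather than 
the import account's upload.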

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk




