[OSM-talk] Import guidelines proposal update

Lester Caine lester at lsces.co.uk
Sat Sep 22 10:14:23 BST 2012


Paul Norman wrote:
>> From: Lester Caine [mailto:lester at lsces.co.uk]
>> Sent: Friday, September 21, 2012 11:47 PM
>> To: 'OSM'
>> Subject: Re: [OSM-talk] Import guidelines proposal update
>>
>> Paul Norman wrote:
>>>> From: Lester Caine [mailto:lester at lsces.co.uk]
>>>> Subject: Re: [OSM-talk] Import guidelines proposal update
>>>>
>>>> who last edited an object! ). Where the import HAS nice unique object
>>>> identifiers things are a lot easier, but raw vector data like the
>>>> French import, and I think the Spanish data you are talking about CAN
>>>> still be 'diffed' against earlier imports, and result in perhaps new
>>>> data that can simply be imported, or perhaps an overlay that
>>>> identifies conflicts that need a human eye. Isn't it better to spend
>>>> time working out a GOOD way of using the data going forward rather
>>>> than having to manually merge the whole lot again in a couple of
>>>> years time ... and every couple of years.
>>>
>>> My thoughts on how to handle this for data with persistent unique
>>> identifiers without adding those as tags is to
>>
>> ******
>>> a. Record the correspondence between source ID and temporary
>>> pre-upload negative OSM ID
>>>
>>> b. Record the correspondence between pre-upload negative OSM ID and
>>> OSM ID
>>>
>>> c. Combine for a correspondence between source ID and OSM ID, and save
>>> this
>> ******
>> EXCEPT - that requires ALL the data from the external import to be
>> loaded in order to create the OSM ID which may not be a bad thing? ...
>> BUT Part of the 'preprocessing' before ever uploading the import would
>> be to identify which objects are going to be uploaded and which not, so
>> you need to create an 'id' initially related to the data source? That is
>> providing that the data source is actually identifiable data.
>
> Well, you don't need to create an ID related to the data source - this is
> for the case of data with persistent unique identifiers.
>
> If decisions were made to not upload parts of the data with the first upload
> this could easily be captured with the fact that there is no pre-upload
> negative ID corresponding to a particular source ID.

BUT you may still need to identify the un-merged data when processing in 
later upload cycles ... see below.
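
As a rough illustration of steps a. to c. quoted above, and of how the un-merged objects could still be found in a later cycle, here is a minimal sketch. It assumes the two correspondences are kept as plain dictionaries; the function names and the sample IDs are made up for the example and are not part of any existing import tool.

```python
def combine_id_maps(source_to_temp, temp_to_osm):
    """Compose source ID -> pre-upload negative ID -> OSM ID (step c)."""
    combined = {}
    for source_id, temp_id in source_to_temp.items():
        # Only objects that actually received an OSM ID on upload are kept.
        if temp_id in temp_to_osm:
            combined[source_id] = temp_to_osm[temp_id]
    return combined


def unmerged_source_ids(all_source_ids, source_to_temp):
    """Source IDs that never got a negative ID, i.e. were left out of the upload."""
    return sorted(set(all_source_ids) - set(source_to_temp))


# Hypothetical sample data: three source objects, two of them uploaded.
all_source_ids = ["A1", "A2", "A3"]
source_to_temp = {"A1": -1, "A2": -2}   # step a
temp_to_osm = {-1: 1001, -2: 1002}      # step b

source_to_osm = combine_id_maps(source_to_temp, temp_to_osm)
unmerged = unmerged_source_ids(all_source_ids, source_to_temp)
```

Saving `source_to_osm` together with the `unmerged` list means a later cycle can tell "never uploaded" apart from "new in this release of the source".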

>> What I had not considered up until now is if the data source is simply a
>> raw vector file version of a paper map, then while the individual
>> lines could be 'imported' the data is almost useless until it has been
>> 'identified'? You may just as well simply trace? But even here all is
>> not lost since one can still pre-process the data and provide the link
>> back as to which lines have been copied and which not. In which case the
>> OSM ID provides additional data back to the source, but I doubt that
>> there is any value simply importing millions of line segments directly
>> into the main database? This has to be a secondary staging area to
>> handle that data?
>
> Data that is purely vectors with absolutely no information that can be
> turned into OSM tags is basically useless. For the case where objects do not
> have a unique ID you'll have to use spatial matching, likely in PostGIS.
> This may run into problems if the geometry in the source substantially
> changes for the same object on the ground but this is an inherent limitation
> of the lack of persistent IDs.

I beg to differ here, although I did originally think the same!
We need some feedback from users who have access to this type of data, and I am 
wondering, based on the comments about the French data, whether THAT is of this style.
And I am STILL looking at a staging layer anyway!

Raw vector data like this - if it is all that is available - has to be traced 
and tagged, but why shouldn't it be provided as a layer from which line segments 
can simply be selected rather than having to trace them? 
click, click, click, click, close (to join into an area), identify. The processed 
lines can then be hidden and you move on to the next set ... Your comment on 
'stability' of the coordinates between imports is valid, and needs managing, but 
I can see a case for 'tracing' say a street element, tagging it, but NOT 
including it in the later 'import'. You just need to make that information 
persistent without using OSM IDs.
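
One possible way to keep that information without OSM IDs is to key each source segment on its own coordinates. The sketch below is only an assumption about how a staging layer might do it: `segment_key`, the status values and the sample coordinates are all hypothetical, and the key only survives between source releases if the coordinates are unchanged at the chosen precision, which is exactly the stability problem mentioned above.

```python
import hashlib


def segment_key(coords, precision=6):
    """Build a stable key for a line from its (lon, lat) pairs.

    Coordinates are scaled to integers so tiny float noise does not
    change the key; a line and its reverse give the same key.
    """
    scale = 10 ** precision
    scaled = [(int(round(lon * scale)), int(round(lat * scale)))
              for lon, lat in coords]
    canonical = min(scaled, scaled[::-1])  # ignore drawing direction
    text = ";".join(f"{x},{y}" for x, y in canonical)
    return hashlib.sha1(text.encode("ascii")).hexdigest()


def remaining_segments(segments, status):
    """Segments in the staging layer that nobody has dealt with yet."""
    return [seg for seg in segments if segment_key(seg) not in status]


# Hypothetical staging data: two raw line segments from the source.
street = [(2.3500, 48.8500), (2.3510, 48.8505)]
fence = [(2.3600, 48.8600), (2.3605, 48.8610)]

# The street was traced and tagged by hand, so it is recorded as
# processed and flagged to be left out of any later bulk import.
status = {segment_key(street): "traced, skip import"}

todo = remaining_segments([street, fence], status)
```

The `status` table lives alongside the source data, so hiding processed lines in the editor and excluding them from a later import both come down to a lookup on the same key.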

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk


