[talk-au] Import vs filtering query
mapslittle at gmail.com
Sat Sep 4 10:51:30 UTC 2021
Hi all, my understanding is that the process described below is a big filtering exercise rather than a data import, but since I’ve never been involved in an import before, I’d like to check before progressing. Thanks in advance for your feedback.
Goal: to update road surface tags across regional Victoria where necessary. Many surface tags were added 8-10 years ago and a surprising number of roads have been surfaced since then. (I’m only interested in sealed/paved vs unsealed/unpaved options, not subsets of these.)
Method: compare road surface data in OSM against data in the Vic government’s transport dataset, which we have permission and a waiver to use. All rural roads from motorways to unclassified (not residential, service, etc) that have different tags in OSM and the gov dataset will be examined against satellite imagery and Mapillary, and any decisions on whether to update the surface tags will be made based solely on the imagery. No data will be directly copied from the gov dataset. Hence, as I understand OSM’s import guidelines, this is a big filtering exercise rather than an import. Is that a correct interpretation? I’ve added a longer explanation below to help answer any questions.
Basic assumptions: (1) I assume both datasets were made independently, as I’ve not seen any evidence that OSM surface tags were copied from the Vic data (or that the gov copied from OSM). (2) If the 2 independent datasets both indicate the same surface then I assume it is most likely to be correct. If they indicate different surfaces then one must be in error. At the outset, I have no idea how accurate the Vic gov dataset is, so I’m not assuming it is infallible (it’s definitely not; see comment below).
Methods: for every road segment that has a different surface tag in the 2 datasets, I’d inspect the road using available imagery, as is normally done when adding or updating a surface tag. Existing OSM tags will either be altered or retained, as required. There’s no ambiguity involved in updating a tag from unpaved to paved. It’s much less common to need to update a tag from paved to unpaved. Again, this will be done based on imagery, regardless of what the Vic data says.
Some prelim observations: I’ve trialled the method in NW Vic, where it works fine on longer road segments/ways. The approach would have to be restricted to ways longer than 1-2 km; shorter ways would be ignored. From an initial subset of about 50 roads > 5 km long in NW Vic, I found about 2/3 of the discrepancies between the 2 datasets did not warrant any change in OSM and about 1/3 did. The Vic gov data doesn’t seem to be as up-to-date as the imagery and isn’t by any means perfect. Regardless, the approach looks to be a very effective way to find out-of-date and inaccurate road surface data across the state.
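For illustration only, the filtering step described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual workflow: it assumes each OSM way has already been matched to a gov dataset segment, and the record layout (MatchedWay), the surface value groupings, and the 2 km default threshold are all hypothetical. The output is only a list of ways to inspect against imagery; nothing is copied from the gov data and no tags are changed.

```python
from dataclasses import dataclass
from typing import Optional

# Highway classes in scope: motorway down to unclassified
# (residential, service, etc. are excluded).
IN_SCOPE = {"motorway", "trunk", "primary", "secondary", "tertiary", "unclassified"}

# Hypothetical grouping of detailed surface values into the two broad classes.
PAVED = {"paved", "sealed", "asphalt", "concrete", "chipseal"}
UNPAVED = {"unpaved", "unsealed", "gravel", "fine_gravel", "compacted",
           "dirt", "ground", "earth", "sand"}


def broad_surface(value: Optional[str]) -> Optional[str]:
    """Collapse a surface value to 'paved' or 'unpaved'; None if unknown."""
    if value is None:
        return None
    v = value.strip().lower()
    if v in PAVED:
        return "paved"
    if v in UNPAVED:
        return "unpaved"
    return None


@dataclass
class MatchedWay:
    """Hypothetical record: an OSM way already paired with a gov segment."""
    way_id: int
    highway: str
    length_km: float
    osm_surface: Optional[str]
    gov_surface: Optional[str]


def needs_review(way: MatchedWay, min_length_km: float = 2.0) -> bool:
    """True if the way should be checked against imagery.

    The gov value is used only to flag a discrepancy, never as the new tag.
    """
    if way.highway not in IN_SCOPE:
        return False
    if way.length_km < min_length_km:
        return False
    osm = broad_surface(way.osm_surface)
    gov = broad_surface(way.gov_surface)
    return osm is not None and gov is not None and osm != gov


# Made-up example records, purely to show the filter in use.
ways = [
    MatchedWay(1, "tertiary", 6.2, "unpaved", "sealed"),      # discrepancy
    MatchedWay(2, "residential", 3.0, "unpaved", "sealed"),   # out of scope
    MatchedWay(3, "unclassified", 0.4, "gravel", "sealed"),   # too short
    MatchedWay(4, "primary", 12.0, "asphalt", "sealed"),      # datasets agree
]
review_ids = [w.way_id for w in ways if needs_review(w)]
print(review_ids)
```

Each flagged way would then be inspected manually, and the OSM tag altered or retained on the basis of the imagery alone.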
At this stage I don’t know how many ways will be examined or changed, as it will depend on the minimum length of road I inspect. I’m envisaging about 1000 at most, and probably fewer.
My guess is that, if the process was completed across Vic, then the surface data in OSM would be extremely accurate, and more accurate than in the Vic gov database. If I get through enough of it without going bonkers, I’m interested in summarising the findings to show which discrepancies were most common, etc.
So, back to the original question, is this process ok to pursue, given that the sole function of the gov dataset is to provide a filtering mechanism to identify roads to investigate, and all decisions will be made based on legally available imagery, not the gov data?
Thanks very much for your feedback, Ian