[Imports] Proposed import of South Australian waterbodies data
Henry Haselgrove
haselgrove at gmail.com
Sun Feb 8 23:59:51 UTC 2015
From: Paul Norman [mailto:penorman at mac.com]
Sent: Monday, 26 January 2015 11:49 PM
> You are proposing to use bulk_upload.py. Have you tested it on the dev servers? It is known for being tricky. You will
> also need to make sure that your changesets are a reasonable size, probably under 10 000.
I did try uploading an earlier version of the file to the dev servers. It took a few hours in total. I needed to restart the script a few times, either because it stopped with a network error or became unresponsive. Each time I restarted, the upload appeared to successfully resume. The changeset size is hard-coded to 50000 in the script, but based on your comment I will modify the source code to reduce it to 10000. Perhaps I should spend some time investigating why bulk_upload.py is unreliable, with the aim of improving it for everybody.
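As a rough illustration of the smaller changeset limit discussed above, the sketch below splits a list of elements into batches of at most 10000. It is not code from bulk_upload.py: the constant and function names are hypothetical, and it ignores the real constraint that a way and the nodes it references must be uploaded in a consistent order.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Limit suggested in the review; the constant name is hypothetical,
# not the variable actually used inside bulk_upload.py.
MAX_CHANGESET_SIZE = 10000


def split_into_changesets(
    elements: Sequence[T], max_size: int = MAX_CHANGESET_SIZE
) -> Iterator[List[T]]:
    """Yield consecutive batches of at most max_size elements."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    for start in range(0, len(elements), max_size):
        yield list(elements[start:start + max_size])


if __name__ == "__main__":
    batches = list(split_into_changesets(list(range(25000))))
    print([len(batch) for batch in batches])
```

Uploading one batch per changeset also gives a natural point at which to resume after a network error.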
> Some features have a name of (disused) and appear disused
> I selected random dams, and about 30-40% didn't have any sign of water on Bing. I'm aware of the cyclical nature of
> water features in Australia, but several of them showed no signs of water nor did they seem like a likely location to have
> a new dam built.
South Australia is the driest state in Australia, so most dams in the state would be completely dry for some part of the year. Dams here typically aren’t located on a river or stream, and typically don’t have a wall. I looked at a random subset of 100 of the dams in Bing. Of those, 53 showed clear signs of water. 42 looked quite clearly like dams that happened not to have water at the time, presumably due to dry weather. In the remaining five cases, I couldn’t see an obvious sign of a dam. Perhaps in some of those cases the dam was too shallow to be visible (and empty at the time). However, it does suggest the possibility that some small proportion of the features are wrong.
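To put a rough range on that "small proportion", one option is a Wilson score interval on the 5 unverifiable dams out of the 100 sampled. This is only a sketch: it assumes the 100 dams were a simple random sample of the dataset, and it treats all five as potentially wrong, which is the pessimistic reading.

```python
import math
from typing import Tuple


def wilson_interval(failures: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion failures / n."""
    if n <= 0 or not 0 <= failures <= n:
        raise ValueError("need n > 0 and 0 <= failures <= n")
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    low, high = wilson_interval(5, 100)
    print(f"{low:.1%} to {high:.1%}")
```

With these numbers the interval runs from roughly 2% to 11%, so a sample of 100 cannot rule out an error rate somewhat above the observed 5%.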
> Other features did not have a great accuracy rate, although it is harder to tell wetlands from the air
I agree that there is a proportion of the lake and wetland features that, like some of the dams, don’t make sense when compared with the Bing imagery. My overall sense, from looking at Bing and from comparing with areas that I am familiar with, is that the accuracy of the dataset is good in general. I am encouraged by the fact that it is an authoritative source.
> Many features are badly overnoded (e.g. the water at -34.21985 140.35764). A simplify with a 2m threshold in JOSM
> brought the number of nodes down ten-fold for some features.
Based on your comments I have now run the SimplifyArea plugin on the entire data (using the default parameters). This has reduced the total node count in the file to be uploaded by 23%.
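For readers unfamiliar with what a distance-threshold simplification does, here is a minimal sketch of the classic Douglas-Peucker algorithm. It is not the algorithm used by the SimplifyArea plugin, whose default parameters were used for the import, and it assumes the coordinates have already been projected to planar metres so that a tolerance such as 2 m is meaningful.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # planar x, y in metres (assumed already projected)


def _distance_to_line(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        # a and b coincide (e.g. a closed ring): fall back to point distance
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (a[1] - p[1]) - dy * (a[0] - p[0])) / length


def simplify(points: Sequence[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker: drop nodes closer than tolerance to the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_line(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest node and simplify both halves recursively
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    way = [(0, 0), (5, 0.5), (10, 0), (15, 8), (20, 0)]
    print(simplify(way, 2.0))
```

The recursion is fine for a sketch, but a way with many thousands of nodes would be better served by an iterative version.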
> Best practice would not be to include datasa:FEATURECOD and datasa:OBJECTID
Based on your comment, I have removed these tags.
> Opinion is divided on if source is necessary with a source tag on the changeset
> What kind of plans are there for updates?
The source dataset doesn’t appear to get updated very often (the latest version is dated July 2014). If updated versions of the source data are made available in the future, I will perform the processing steps again to import new features. However, I am not planning to handle features that change shape, change tags, or are deleted.