[Imports] Importing addresses in New South Wales, Australia

Dion Moult dion at thinkmoult.com
Sun Jul 1 08:07:49 UTC 2018


So I've begun with my first batch of address imports in Epping, NSW. The area I focused on is shown in the picture below:

https://gitlab.com/dionmoult/osm-nsw-address-import/blob/master/img/epping-scope.png

It is the area bounded by Carlingford Road, Blaxland Road, and the Epping suburb boundary. The screenshot is from JOSM with the LPI NSW Base Map as the background. I drew a way, clicking on every lot where the LPI NSW Base Map showed an address. I made the following mapping decisions:

 1. If there is existing address information, I am not going to overwrite it and so I don't add a node.
 2. If there is an existing building on the lot but no address information, I will add a node. However, I will not merge the address tags onto the building. This will be a job for another mapper.
 3. I only add a node if the LPI NSW Base Map shows, unambiguously and to the best ability of my human brain, a clear property boundary and a number indicating a unique address there. If anything is uncertain, I don't add a node. The purpose of this is to get 99% of the way there while minimising false positives.

I feel these decisions minimise the risk of the import and keep the import simple. There should be no duplication of data and no overwriting of data. After running the script, I created this result:

https://gitlab.com/dionmoult/osm-nsw-address-import/blob/master/img/epping-results.png

You can see that the address nodes line up very neatly and sit at the exact centroid of each property. The data set being imported is roughly 1600 addresses. Along the way I improved the script a bit to accommodate some errors from the server, parallelize the requests a bit, and handle some edge cases which I noticed (a road named "The Boulevarde" is returned with a Null road type, even though "Boulevard" is listed in their appendix). Processing these 1600 addresses took about half an hour of human time.
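
For anyone curious, the three improvements above (retrying after server errors, running requests in parallel, and coping with a Null road type) could look roughly like the sketch below. This is not the code from the repo: `with_retries`, `normalise_street` and `lookup_all` are hypothetical names, treating server errors as `OSError` is an assumption, and the real LPI response format is not shown here. The fetch function is passed in, so the sketch runs without touching the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(fetch, attempts=3, delay=0.0):
    """Wrap fetch so that a failed call is retried a few times."""
    def wrapped(point):
        last_error = None
        for attempt in range(attempts):
            try:
                return fetch(point)
            except OSError as error:  # assumption: server errors surface as OSError
                last_error = error
                time.sleep(delay * (attempt + 1))  # simple linear back-off
        raise last_error
    return wrapped


def normalise_street(road_name, road_type):
    """Build an addr:street value, tolerating a missing road type.

    Hypothetical field names. A road such as "The Boulevarde" comes back
    with no road type, so the name is used on its own.
    """
    parts = [road_name.strip()]
    if road_type:
        parts.append(road_type.strip())
    return " ".join(parts)


def lookup_all(points, fetch, workers=4):
    """Query every point in parallel, returning results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(with_retries(fetch), points))
```

Passing the fetch function in keeps the retry and threading logic separate from whatever HTTP client does the actual request.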

I've saved out the output to this link:

https://gitlab.com/dionmoult/osm-nsw-address-import/blob/master/review/EPPING-1.osm

Please download the .osm file and review it in JOSM. I have not uploaded this to OSM as a changeset and will not until I get another pair of eyeballs over it :) I look forward to any comments, and please tell me if you spot any red flags! If everything goes smoothly I will upload this and then do a larger area.
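
If you would like a quick sanity check before opening the file in JOSM, something like the sketch below may help. It assumes the file is ordinary OSM XML in which each address is a `node` carrying `addr:housenumber` and `addr:street` tags; `check_addresses` is a hypothetical helper, not part of the repo. It lists nodes missing either tag and any housenumber/street pair that appears more than once.

```python
import xml.etree.ElementTree as ET
from collections import Counter

REQUIRED = ("addr:housenumber", "addr:street")


def check_addresses(osm_xml):
    """Return (incomplete_node_ids, duplicated_addresses) for OSM XML data."""
    root = ET.fromstring(osm_xml)
    incomplete = []
    seen = Counter()
    for node in root.iter("node"):
        # Collect this node's tags into a plain dict of key -> value.
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if not all(tags.get(key) for key in REQUIRED):
            incomplete.append(node.get("id"))
            continue
        seen[(tags["addr:housenumber"], tags["addr:street"])] += 1
    duplicates = sorted(address for address, count in seen.items() if count > 1)
    return incomplete, duplicates
```

To run it on the review file, pass in the contents of the downloaded file, for example `check_addresses(open("EPPING-1.osm", "rb").read())`. A repeated address is only a prompt for a second look, since some addresses are legitimately shared.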

Here is the link to the repo for anybody who wants to contribute: https://gitlab.com/dionmoult/osm-nsw-address-import/ (on GitLab because GitHub is proprietary).

Dion Moult

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On June 27, 2018 8:04 AM, Dion Moult <dion at thinkmoult.com> wrote:

> Thanks for the response, Andrew! Absolutely, sometimes government information is wrong, sometimes we represent it in a better way, etc. This method allows the mapper to make a judgement, case by case, about exactly what to do.
> 
> Probably something I didn't explicitly mention before, but of course as I add address information I plan to do it slowly and post it on talk-au for regular review and local QA.
> 
> If there aren't any issues I will create an import plan on the wiki this weekend, along with some data to review.
> 
> Dion Moult
> 
> -------- Original Message --------
> 
> On 24 Jun 2018, 13:54, Andrew Harvey <andrew.harvey4 at gmail.com> wrote:
> 
> > Overall I'm okay with the approach you've outlined.
> > 
> > > and then upload the changeset tagged as `source:import=NSW LPI Web Services`, this will allow anybody who is checking the history to know how this data was gotten.
> > 
> > +1
> > 
> > > There will be no automatic data overwriting or data conflicts.
> > 
> > That's good! I commented on the talk-au list, but to expand on a few points, OSM is all about mapping what's on the ground. There are many cases where the government-supplied address points database conflicts with information on the ground. The on-the-ground scenario is what should prevail in OSM, in my opinion. If anyone wants a government-supplied address points database, OpenAddresses does a great job at collating that.
> > 
> > Attached or at https://snag.gy/3jBX5a.jpg you can see a place where I've mapped out an address different to the NSW LPI Basemap, and different to GNAF, instead reflecting what's on the ground.
> > 
> > There's also differing views of how to map addresses in OSM, eg. on an entrance node, on a building, on a residential land parcel, as an unconnected address node. Additionally addresses can be duplicated, eg. shops or offices with the same address.
> > 
> > My point is, I'm not against importing this data, but I think we should pick and choose what makes sense to bring in, and do it in a way that doesn't scare mappers away from deleting or changing that imported address data when they find it doesn't represent what's on the ground.
> > 
> > On 23 June 2018 at 21:15, Dion Moult <dion at thinkmoult.com> wrote:
> > 
> > > Good morning all!
> > > 
> > > The state of New South Wales (where Sydney is) in Australia currently has very minimal address information. I am looking to speed up the mapping of addresses. At a bare minimum these can simply be nodes with addr:street and addr:housenumber tags, but of course they could be something better, such as part of a building.
> > > 
> > > Currently, apart from local knowledge, addresses can be mapped using the LPI (Land and Property Information) Base Map, which we have explicit permission from the NSW government to use.
> > > 
> > > https://wiki.openstreetmap.org/wiki/Contributors#Australia
> > > 
> > > https://wiki.openstreetmap.org/wiki/Attribution/New_South_Wales_Government_Data
> > > 
> > > The map provides raster tiles of property boundaries, along with the housenumber of each property. At intersections, however, it is a little ambiguous which street an address belongs to. A mapper is currently able to create a node, interpret the base map background, and then tag it as necessary.
> > > 
> > > However, we have explicit permission to use all of the LPI web services, and the service API allows us to do two things:
> > > 
> > > 1. Ask the question "At this coordinate, what is the address?"
> > > 
> > > 2. Ask the question "What is the coordinate of this address?"
> > > 
> > > Therefore, by roughly choosing coordinates (JOSM has the Edit->Copy Coordinates feature) that we know are within property boundaries (the boundaries are shown in the LPI Base Map in JOSM, which we always have turned on anyway), we can use a small script that will query the LPI Web Services API, ask those two questions, and then automatically place nodes at the accurate government centroid of each address. We can then review the results manually, and then upload the changeset tagged as `source:import=NSW LPI Web Services`. This will allow anybody who is checking the history to know how this data was gotten. This speeds up the adding and tagging of nodes, as the mapper does not need to type in data manually (which is prone to spelling mistakes, and the housenumber in the raster base map is not high resolution and can be misread) and does not need to interpret ambiguous intersections where the street is not so clear. It will also improve the quality of the node placement, as each node will be at the actual centroid of the property rather than wherever the mapper happens to place it.
> > > 
> > > As you can see I am not proposing a huge data import, so there is no actual data to review. Just a semi-automation of the manual address tagging that we do now anyway. However by some interpretation it could be called an "import" as it is semi-automated node placement based off a government API query. It is up to the individual mapper to manually choose coordinates that they want to tag addresses at - so if the mapper sees that addresses have already been tagged, they will simply not query the LPI services for the address at that point. There will be no automatic data overwriting or data conflicts.
> > > 
> > > I have discussed this on the talk-au mailing list, and the responses seem positive. It's quite a conservative proposal, after all. You can see the thread here:
> > > 
> > > https://lists.openstreetmap.org/pipermail/talk-au/2018-June/011937.html
> > > 
> > > The small script along with sample data is shown here:
> > > 
> > > https://gist.github.com/Moult/5821c74fb792b7afa5d758aebea68e40
> > > 
> > > I did a small test of 17 nodes in this changeset to show how it would work. I manually reviewed it and I know this area based off local knowledge.
> > > 
> > > https://www.openstreetmap.org/changeset/59909707
> > > 
> > > Looking forward to your comments!
> > > 
> > > Dion Moult
> > > 
> > > _______________________________________________
> > > 
> > > Imports mailing list
> > > 
> > > Imports at openstreetmap.org
> > > 
> > > https://lists.openstreetmap.org/listinfo/imports


