[talk-au] Victorian Vicmap Address Import Proposal

Andrew Davidson theswavu at gmail.com
Thu May 27 10:39:52 UTC 2021


On 25/5/21 4:41 pm, Daniel O'Connor wrote:
> I'd make a polite argument there is still value in at least the suburb, 
> possibly postcode being still provided.  When exporting data via 
> overpass as CSV; it's not currently easy or obvious to appropriately 
> bring in the parent attributes; even if it is for a Real Human looking 
> at the map.
> There's a fair number of use cases for "data in a spreadsheet 
> friendly format" I feel.

You don't need to add addr:suburb to get that; all you need is a little 
Python.

Assuming you have a CSV dump of the address points from OSM, e.g.:

@type,@id,@lat,@lon,addr:unit,addr:housenumber,addr:street
node,34495141,-35.2641690,149.1223146,,3,Sargood Street
node,40293773,-35.2640376,149.1226107,,9,Sargood Street
node,254020381,-35.2623407,149.1451050,1,5,Edgar Street
node,291548764,-35.3847749,149.0720245,,56,Mannheim Street
node,318854867,-35.3339561,149.1697838,,289,Canberra Avenue
node,318855426,-35.3244730,149.1792480,4,59-61,Wollongong Street
node,318856277,-35.3150098,149.1417359,,19,Jardine Street
node,318859652,-35.3627241,149.0815960,,70,Hodgson Crescent
node,318859688,-35.3627835,149.0817144,,70,Hodgson Crescent
.
.
.


and you have the corresponding admin_level 10 and postcode boundaries in 
GeoJSON:

act_suburbs.geojson
postcodes.geojson

then you import the libraries you need:

import pandas as pd
import geopandas as gpd

read in the address points:

addlist = pd.read_csv('act_address_dump.csv', low_memory=False)

convert the list to a geoframe:

address_points = gpd.GeoDataFrame(
    addlist,
    crs="EPSG:4326",
    geometry=gpd.points_from_xy(addlist['@lon'], addlist['@lat']))

read in the suburb boundaries:

suburbs = gpd.read_file('act_suburbs.geojson')

drop all of the tags that we will not need:

suburbs = suburbs[['name','geometry']]

then do the same for the post code boundaries:

postcodes = gpd.read_file('postcodes.geojson')
postcodes = postcodes[['postal_code','geometry']]

now we merge the three data sets together with a series of spatial 
joins. First the suburb names:

address_points = gpd.sjoin(address_points,suburbs,op="within")

the join creates a column we don't need so get rid of that:

address_points = address_points.drop(['index_right'], axis=1)

then join the post codes:

address_points = gpd.sjoin(address_points,postcodes,op="within")

we've now got all of the data into the one frame but we need to clean up 
the column labels before we write it out, so do a rename:

address_points = address_points.rename(
    columns={"name": "addr:suburb", "postal_code": "addr:postcode"})

and we can then write out the columns we want to a csv file:

address_points[['@type', '@id', '@lat', '@lon', 'addr:unit', 'addr:housenumber',
                'addr:street', 'addr:suburb', 'addr:postcode']].to_csv('act_out.csv')

which gives you:

,@type,@id,@lat,@lon,addr:unit,addr:housenumber,addr:street,addr:suburb,addr:postcode
310,node,2441363738,-35.3076927,149.1333269,,7,National Circuit,Barton,2600
2280,way,564187362,-35.1539837,149.1117804,,5,Jimmy Little Street,Moncrieff,2914
4414,way,823380125,-35.2242021,149.0456133,,55,Ennor Crescent,Florey,2615
2249,way,547120674,-35.2540932,149.1531645,,24,Piper Street,Ainslie,2602
1548,way,220316259,-35.3349388,149.0923894,,27,Coxen Street,Hughes,2605
4511,way,847394981,-35.2353182,149.0470223,,2,Diggles Street,Page,2614
3747,way,796706631,-35.2288001,149.0513507,,4,Caddy Place,Florey,2615
555,node,4214686496,-35.318041,149.1264149,,39,Empire Circuit,Forrest,2603
3280,way,776943661,-35.4468204,149.1164925,,8,Mackerras Crescent,Theodore,2905
1052,node,7930404220,-35.1705767,149.0708312,,13,Gladstone Street,Hall,2618
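
The unnamed first column is just the dataframe index, which to_csv writes 
out by default. If you'd rather not have it in the spreadsheet, pass 
index=False when writing:

address_points[['@type', '@id', '@lat', '@lon', 'addr:unit', 'addr:housenumber',
                'addr:street', 'addr:suburb', 'addr:postcode']].to_csv('act_out.csv', index=False)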

I did this in an interactive IPython session, but if this is something 
people want it could easily be turned into a Python script that does the 
pull from Overpass and writes out the file.
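
The pull itself is only a few lines. Something along these lines should 
work (untested; the public Overpass endpoint, the ACT area selection and 
the timeout here are just placeholders, not what I actually ran):

import io
import requests
import pandas as pd

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# CSV output with the same columns as the dump above; "out center" gives
# ways a usable @lat/@lon as well as nodes
query = """
[out:csv(::type, ::id, ::lat, ::lon, "addr:unit", "addr:housenumber", "addr:street"; true; ",")][timeout:300];
area["name"="Australian Capital Territory"]["admin_level"="4"]->.act;
nwr["addr:housenumber"](area.act);
out center;
"""

response = requests.post(OVERPASS_URL, data={"data": query})
response.raise_for_status()

# read the returned CSV straight into a dataframe
addlist = pd.read_csv(io.StringIO(response.text), low_memory=False)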

I did the whole country in one go to see how well it scales and the run 
time was pretty much the same. Of course you can't do postcodes 
everywhere yet, as we haven't put them all in.
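
One thing to watch if you run it over a bigger area: sjoin does an inner 
join by default, so any address that doesn't fall inside a postcode 
boundary gets dropped rather than coming through with a blank 
addr:postcode. Passing how="left" would keep those rows:

address_points = gpd.sjoin(address_points, postcodes, how="left", op="within")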







