[OSM-talk-be] Addresses in Belgium
Jo
winfixit at gmail.com
Sat Dec 22 23:53:16 UTC 2012
2012/12/22 Sander Deryckere <sanderd17 at gmail.com>
> After the articles about the addresses in Germany (where every average
> German contributor should gather 1000 addresses), I wondered how we were
> doing in Belgium. So I downloaded the address information in Belgium and
> did some counting.
>
> I only counted the addr:housenumber and addr:street tags. I didn't look
> at the associatedStreet relations, as I haven't often seen those used
> on their own. If you think I'm wrong, I can do my counting again. I
> also didn't take addr:interpolation into account, simply because it's
> difficult to analyse.
>
>
>
> The first thing you notice is that there are a lot of features with
> housenumber information, but without street information. Other
> information (such as city) can be determined from closed boundaries, but
> it's often ambiguous and hard to determine the street from other OSM
> features.
>
> Some applications (such as Nominatim) implement "street guessing",
> sometimes with wrong results. Other apps (such as OsmAnd) just don't
> include houses without street information in their search, so that info
> is completely lost.
>
> Now, how many addresses are missing? We can't assume Belgium has 11
> million addresses, as many people live together. So I looked for other data.
> The number of addresses in Belgium seems impossible to find, but I did find
> the number of families in Belgium:
> http://www.centrumvoorsociaalbeleid.be/indicatoren/index.php?q=node/176.
> I assume that the number of addresses must be about the same: there are
> addresses without families (like firms), but also multiple families living
> in one apartment with one address (but often different post boxes).
>
> So that means we need about 4.5 million addresses and currently have
> 112 000. The completeness is thus about 2.5% of address data. Not a very
> good number. When searching for 40 addresses, only 1 on average will be
> found in OSM.
>
> But it looks better when we see how the data evolved. Of those 112 000
> addresses, 76 000 were created (or modified) in 2012 and 105 000 since
> 2011. That means that the number of addresses created is going up. If we
> can keep a bit of growth, we could map the majority of addresses in a few
> years.
>
> So continue with the effort, and map as many addresses as possible.
>
> *How to map?*
>
> There's also a lot of armchair mapping that can be done. First of all,
> there are all those street names that need to be added. People are better
> than software at guessing the right street name, and if there's doubt,
> just add a fixme tag.
>
> Besides that, if you see a restaurant on the map without address data,
> just search for the website of that restaurant and get the address data
> from there. You're doing nothing wrong: as long as you don't take the data
> from a database (such as the golden pages), you aren't violating any
> copyrights or database rights. While you're at the website of the
> restaurant, you can also add other information such as opening hours or
> phone number.
>
> Of course, when the weather is good, you can go out and map addresses. I
> normally use photo mapping because it's so fast (and if you see another
> feature, you can also just take a picture of it), but there are also apps
> for that, such as the Keypadmapper app for Android.
>
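As a quick sanity check on the figures quoted above, the arithmetic can be redone in a few lines of Python. This is only a sketch: the numbers are the rounded ones from Sander's message, and the variable and function names are my own, not part of his analysis. Because OSM only exposes the timestamp of the last edit, the yearly figures are labelled "last touched" rather than "created".

```python
# Rounded figures taken from the quoted message; all names are illustrative.
ESTIMATED_ADDRESSES = 4_500_000   # approximated by the number of families
MAPPED_ADDRESSES = 112_000        # objects with addr:housenumber in OSM
TOUCHED_IN_2012 = 76_000          # created or modified in 2012
TOUCHED_SINCE_2011 = 105_000      # created or modified in 2011 or 2012


def completeness(mapped, estimated):
    """Return the mapped share of the estimated total, as a percentage."""
    return 100.0 * mapped / estimated


share = completeness(MAPPED_ADDRESSES, ESTIMATED_ADDRESSES)
one_in = ESTIMATED_ADDRESSES / MAPPED_ADDRESSES
touched_in_2011 = TOUCHED_SINCE_2011 - TOUCHED_IN_2012
touched_before_2011 = MAPPED_ADDRESSES - TOUCHED_SINCE_2011

print(f"completeness: {share:.1f}%")
print(f"roughly 1 address in {one_in:.0f} is mapped")
print(f"last touched in 2011: {touched_in_2011}")
print(f"last touched before 2011: {touched_before_2011}")
```

The split by year makes the growth visible: far more objects were last touched in 2012 than in 2011, and only a small remainder dates from before 2011.
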
I was curious about some actual statistics, so I used Sander's Overpass
query (somewhat modified to include associatedStreet relations):

area[name="België - Belgique - Belgien"];
(
  node(area);
  <;
) -> .allnodeswaysrelationsinBelgium;
(
  rel.allnodeswaysrelationsinBelgium["type"="associatedStreet"];
) -> .allassociatedStreetrelations;
(
  rel.allnodeswaysrelationsinBelgium["addr:housenumber"];
) -> .alladdr_housenumberrelations;
(
  way.allnodeswaysrelationsinBelgium["addr:housenumber"];
) -> .alladdr_housenumberways;
(
  node.allnodeswaysrelationsinBelgium["addr:housenumber"];
) -> .alladdr_housenumbernodes;
(
  .allassociatedStreetrelations;
  .allassociatedStreetrelations >;
  .alladdr_housenumberrelations;
  .alladdr_housenumberrelations >;
  .alladdr_housenumberways;
  .alladdr_housenumberways >;
  .alladdr_housenumbernodes;
);
out meta qt;

Then some Python magic with SAX to extract the users from the xml:

import xml.sax

''' sample data:
<node id="1248089374" lat="51.2509015" lon="5.5448618" version="1"
timestamp="2011-04-17T09:51:55Z" changeset="7884790" uid="9176"
user="Maarten Deen">
<tag k="addr:city" v="Hamont"/>
<tag k="addr:country" v="BE"/>
<tag k="addr:housenumber" v="32"/>
<tag k="addr:postcode" v="3930"/>
<tag k="addr:street" v="Stad"/>
<tag k="amenity" v="bank"/>
<tag k="atm" v="yes"/>
<tag k="name" v="BNP Paribas Fortis"/>
</node>
<way id="165086472" version="1" timestamp="2012-05-26T21:27:01Z"
changeset="11710662" uid="436365" user="escada">
<nd ref="1766686928"/>
<nd ref="1766686927"/>
<nd ref="1766686912"/>
<nd ref="1766686914"/>
<nd ref="1766686928"/>
<tag k="addr:housenumber" v="1"/>
<tag k="addr:street" v="Route de l'Arboretum"/>
<tag k="building" v="yes"/>
<tag k="source" v="bing2012"/>
</way>
'''


class OSMContentHandler(xml.sax.ContentHandler):
    """Count, per user, the objects that carry an addr:housenumber tag."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.noteattributes = {}
        self.tags = {}
        self.users = {}

    def startElement(self, tagname, attrs):
        if tagname == "tag":
            # Collect the tags of the element currently being parsed.
            self.tags[attrs.getValue("k")] = attrs.getValue("v")
        elif tagname in ["node", "way", "relation"]:
            # The parser may reuse attrs, so keep a copy rather than a reference.
            self.noteattributes = attrs.copy()
            self.tags = {}

    def endElement(self, tagname):
        if tagname in ["node", "way", "relation"]:
            if 'addr:housenumber' in self.tags:
                # Credit the last user who touched this object.
                user = self.noteattributes['user']
                self.users[user] = self.users.get(user, 0) + 1

    def endDocument(self):
        # newline='' stops Windows from turning '\r\n' into '\r\r\n'.
        with open("C:/data/usercount.csv", mode='w', encoding='utf-8',
                  newline='') as csvfile:
            for key in self.users:
                csvfile.write(key + ';' + str(self.users[key]) + '\r\n')


def main(sourceFileName):
    # Binary mode lets the parser pick up the encoding from the XML header.
    with open(sourceFileName, 'rb') as source:
        xml.sax.parse(source, OSMContentHandler())


if __name__ == "__main__":
    main('C:/Data/OSM/addressesInBelgium.osm')

As usual with OSM data, this only takes the last user who modified the
object into account, so it's a bit skewed, given that it's often the first
user who did the hardest work... But digging in the history of those
elements to find out when exactly the house number was added would lead a
bit too far.
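
For anyone who does want to dig into the history, the core of it would look roughly like the sketch below. It assumes the versions of one object have already been fetched and turned into a list of dicts; that input format and the function name are made up for illustration and are not what any OSM tool returns as-is.

```python
def first_user_with_tag(versions, tag_key="addr:housenumber"):
    """Return the user of the earliest version that carries tag_key, or None.

    `versions` is a hypothetical list of dicts, one per object version,
    each with 'version', 'user' and 'tags' entries.
    """
    for entry in sorted(versions, key=lambda e: e["version"]):
        if tag_key in entry["tags"]:
            return entry["user"]
    return None


# Made-up history: the house number arrives in version 2, but a later edit
# by someone else is what a normal extract reports as the "last user".
history = [
    {"version": 1, "user": "mapper_a", "tags": {"building": "yes"}},
    {"version": 2, "user": "mapper_b",
     "tags": {"building": "yes", "addr:housenumber": "1"}},
    {"version": 3, "user": "mapper_c",
     "tags": {"building": "house", "addr:housenumber": "1"}},
]

print(first_user_with_tag(history))
```

In this toy history the count in my script would go to mapper_c, while the house number was really added by mapper_b.
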
And the winner is:
TAA 35216 Amazing!
escada 15131 Impressive
zors1843 9293 Keep it up
Polyglot 7336
Sanderd17 5515
Patrick Bous 3600
lodde1949 3070
ivodeb 2705
eMerzh 2549
GuyVV 1864
Filip 1844
orioane 1615
meet00 1422
rob72 1394
Scapor 1102
D!zzy 873
Runner42 707
susvhv 703
Philippe D 679
Eimai 654
raskas 628
pweemeeuw 458
Sebke 442
OlivierCO 427
Renaud Michel 426
Kurt Roeckx 403
CyFo 369
Ben Abelshausen 340
xybot 334
moyogo 318
jan/42 305
ivom 293
SPQRobin 284
jvh 253
martino260 253
Jorieke V 250
toeklk 234
linusable 223
JDub 196
cimm 194
fmarchal 183
Tbj 176
Victor LP 173
jbwi 157
JoBru 155
rtega 150
M!dgard 145
Jacques Lys 141
boskoyevsky 140
Oli-Wan 139
Greggo 138
btrs 134
Loll78 131
Toi 131
Piwi 131
Maarten Deen 129
QuercE 122
Glenn Plas 111
Med 110
Eric Goris 109
DJ_frombelgium 107
Lithion 106
Jowielen 104
Skratz 104
guidostraat 102
188 contributors added at least 10 addresses
508 added at least 2
917 added at least 1 address in Belgium
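
The three totals above can be derived from the usercount CSV with something like the following. It is a sketch only: the helper names are invented here, the sample rows are made up rather than taken from the real file, and it assumes no user name contains a semicolon (the script writes the file without any quoting).

```python
import csv
import io


def read_user_counts(lines):
    """Parse 'user;count' rows into a dict, skipping empty rows."""
    return {row[0]: int(row[1])
            for row in csv.reader(lines, delimiter=";") if row}


def count_at_least(user_counts, thresholds=(10, 2, 1)):
    """Map each threshold to the number of users at or above it."""
    return {t: sum(1 for n in user_counts.values() if n >= t)
            for t in thresholds}


# Tiny made-up sample standing in for usercount.csv.
sample = io.StringIO("alice;35\nbob;2\ncarol;1\ndave;12\n")
counts = read_user_counts(sample)
print(count_at_least(counts))
```

To run it on the real output, pass an open file handle for usercount.csv to read_user_counts instead of the StringIO sample.
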
Polyglot