[Imports] Ethiopia POI Import
Imre Samu
pella.samu at gmail.com
Sun May 8 00:13:07 UTC 2022
Hi Alexander!
Thank you for opening the data! :-)
Quick feedback about AddisMapPOI.osm.bz2:
- Some of the issues/comments below are probably related.
- The data needs some cleaning / re-tagging.
- De-duplication is key!
My suggestion:
- split/partition the data by POI type,
- and first focus on the more stable POI types
-- Diplomatic, Banks, College, ...
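A minimal sketch of such a partitioning step in Python. The list of "primary" keys and the sample POIs are assumptions for illustration only; they are not taken from the dataset.

```python
from collections import defaultdict

# Keys treated as "primary" POI keys. This list is an assumption.
PRIMARY_KEYS = ("amenity", "tourism", "shop", "office", "historic")

def poi_type(tags):
    """Return 'key=value' for the first primary key found, else 'other'."""
    for key in PRIMARY_KEYS:
        if key in tags:
            return f"{key}={tags[key]}"
    return "other"

def partition_by_type(pois):
    """Group POIs (plain dicts of tags) by their type string."""
    groups = defaultdict(list)
    for tags in pois:
        groups[poi_type(tags)].append(tags)
    return dict(groups)

# Hypothetical sample data.
sample = [
    {"amenity": "bank", "name": "Example Bank"},
    {"amenity": "college", "name": "Example College"},
    {"tourism": "hotel", "name": "Example Hotel"},
    {"name": "Example Sook"},
]
groups = partition_by_type(sample)
```

Each group could then be reviewed and de-duplicated on its own, starting with the stable types.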
details:
1.) Addresses: is it possible to clean up the tags? Most of them are not
documented in the OSM wiki, so most programs cannot use them.
$ osmium tags-count AddisMapPOI.osm | grep addr
33260 "addr:subcity"
33088 "addr:woreda"
29356 "addr:description"
24033 "addr:housenumber"
21120 "addr:street"
176 "addr:city"
11 "addr:postcode"
10 "addr:kebele"
7 "addr:Kebele"
7 "addr:housename"
5 "addr:buildingname"
4 "addr:p.o.box"
2 "addr:floor"
2 "addr:full"
2 "addr:wereda"
1 "addr:country"
1 "addr:mobilephone"
based on https://en.wikipedia.org/wiki/Subdivisions_of_Ethiopia
- "woredas" (districts) ---> addr:district ?
- "kebele" (wards) ---> addr:ward ?
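A rough Python sketch of how the key variants could be unified. The two spelling fixes come from the counts above; the optional renames follow the suggestions marked with "?" and are not settled tagging.

```python
# Case/spelling variants seen in the tag counts.
KEY_FIXES = {
    "addr:Kebele": "addr:kebele",
    "addr:wereda": "addr:woreda",
}
# Tentative renames, only applied on request (assumption, not agreed tagging).
KEY_RENAMES = {
    "addr:woreda": "addr:district",
    "addr:kebele": "addr:ward",
}

def normalize_addr_keys(tags, rename=False):
    """Return a copy of tags with address key variants unified."""
    out = {}
    for key, value in tags.items():
        key = KEY_FIXES.get(key, key)
        if rename:
            key = KEY_RENAMES.get(key, key)
        # If two variants collide, the first value seen is kept.
        out.setdefault(key, value)
    return out
```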
2.) I have found strange house numbers. Are these valid?
$ osmium tags-count AddisMapPOI.osm '*housenumber=*' | head
3845 "addr:housenumber" "New"
655 "addr:housenumber" "new"
531 "addr:housenumber" "NA"
451 "addr:housenumber" "AV"
141 "addr:housenumber" "1"
133 "addr:housenumber" "shed"
108 "addr:housenumber" "2"
95 "addr:housenumber" "Shed"
91 "addr:housenumber" "4"
79 "addr:housenumber" "3"
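If values such as "New", "NA", "AV" and "shed" turn out to be placeholders (an assumption that the data owner would need to confirm), they could be dropped before import. A minimal Python sketch:

```python
# Values from the counts above that look like placeholders (assumption).
PLACEHOLDERS = {"new", "na", "av", "shed"}

def clean_housenumber(tags):
    """Return a copy of tags without a placeholder addr:housenumber."""
    value = tags.get("addr:housenumber")
    if value is not None and value.strip().casefold() in PLACEHOLDERS:
        return {k: v for k, v in tags.items() if k != "addr:housenumber"}
    return dict(tags)
```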
3.) I have found some strange POI names.
$ osmium tags-count AddisMapPOI.osm '*=*losed*' | head -n 30
143 "name" "Closed Sook"
72 "owner" "Closed"
22 "name" "Sook (closed)"
20 "name" "Sook(closed)"
19 "owner" "Closed Sook"
13 "name" "Closed"
12 "name" "(closed)Sook"
9 "name" "(Closed)Sook"
7 "name" "Sook (Closed)"
6 "name" "Sook(Closed)"
5 "owner" "No Information/closed"
4 "name" "Closed Café"
4 "name" "Sook Closed"
3 "name" "Closed Bar"
3 "name" "Closed Grocery"
3 "name" "Closed SOOK"
3 "owner" "(closed)"
2 "name" "Abdela Sook(closed)"
2 "name" "Abeba Sook Closed"
2 "name" "Closed Chat Bet"
2 "name" "Closed Wholesaler"
2 "name" "Kurs Bet Closed"
2 "name" "Pepsi Sook(closed)"
2 "owner" "(Closed)"
2 "owner" "Closed Café"
1 "contact:phone" "+251 Closed"
1 "name" "(Closed Sook)"
1 "name" "(Closed)Grocery"
1 "name" "(closed Chatbet"
1 "name" "(closed) Grocery"
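A small Python sketch for spotting these "closed" markers in names. The function name and the cleanup rules are hypothetical; flagged POIs would probably be better set aside for review than imported.

```python
import re

# Matches the word "closed" in any case, with optional surrounding parentheses.
CLOSED_RE = re.compile(r"\(?\bclosed\b\)?", re.IGNORECASE)

def split_closed(name):
    """Return (name without the closed marker, whether a marker was found)."""
    if not CLOSED_RE.search(name):
        return name, False
    cleaned = CLOSED_RE.sub(" ", name)
    # Collapse whitespace and trim leftover brackets or slashes.
    cleaned = " ".join(cleaned.split()).strip(" ()/")
    return cleaned, True
```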
4.) I have found some strange phone numbers:
$ osmium tags-count AddisMapPOI.osm '*=+251*' | head
3441 "contact:phone" "+251 "
20 "contact:phone" "+251 AV"
10 "contact:phone" "+251 NA"
4 "contact:phone" "+251 911"
4 "contact:phone" "+251 N.A"
3 "contact:phone" "+251 115510694"
3 "contact:phone" "+251 115529979"
3 "contact:phone" "+251 11652331"
3 "contact:phone" "+251 116632828"
3 "contact:phone" "+251 9"
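A possible plausibility check in Python, assuming that a complete Ethiopian number is "+251" followed by 9 national digits. That length is my assumption and should be verified before relying on it.

```python
import re

# Assumption: country code +251 followed by exactly 9 digits.
PHONE_RE = re.compile(r"^\+251\d{9}$")

def is_plausible_phone(value):
    """True if value looks like a complete +251 number, ignoring spaces and dashes."""
    compact = re.sub(r"[\s\-]", "", value)
    return bool(PHONE_RE.match(compact))
```

With this rule, values such as "+251 ", "+251 AV" and "+251 911" are rejected, and "+251 11652331" is flagged because it is one digit short.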
5.) Embassies:
$ osmium tags-count AddisMapPOI.osm 'amenity=embassy'
20 "amenity" "embassy"
As far as I know, "amenity=embassy" is deprecated, so these need to be converted to the new
tagging (office=diplomatic plus diplomatic=*);
see: https://wiki.openstreetmap.org/wiki/Tag%3Aamenity%3Dembassy
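A minimal Python sketch of the retagging, assuming the replacement scheme is office=diplomatic plus diplomatic=*. Defaulting to diplomatic=embassy is an assumption; each of the 20 objects should be checked by hand, since some may be consulates or other missions.

```python
def retag_embassy(tags):
    """Return a copy of tags with amenity=embassy replaced by office=diplomatic."""
    if tags.get("amenity") != "embassy":
        return dict(tags)
    out = {k: v for k, v in tags.items() if k != "amenity"}
    out["office"] = "diplomatic"  # overwrites any existing office=* value
    # Keep an existing diplomatic=* value; otherwise assume an embassy.
    out.setdefault("diplomatic", "embassy")
    return out
```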
6.) The source_ref links now return "POI not found" ( ~ 404 ), so there is no
need to import them.
$ cat AddisMapPOI.osm | grep http | head -n2
<tag k='source_ref' v='http://pris.map.et/2014-12-07-133039/pic00787.jpg' />
<tag k='source_ref' v='http://pris.map.et/2014-12-07-133039/pic00264.jpg' />
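One way to drop such links is to filter source_ref by host. In this Python sketch the set of dead hosts is an assumption based on the two sample URLs; other hosts in the file would need their own check.

```python
from urllib.parse import urlparse

# Hosts assumed to be dead, based on the "POI not found" responses.
DEAD_HOSTS = {"pris.map.et"}

def drop_dead_source_ref(tags):
    """Return a copy of tags without source_ref if it points to a dead host."""
    url = tags.get("source_ref", "")
    if urlparse(url).hostname in DEAD_HOSTS:
        return {k: v for k, v in tags.items() if k != "source_ref"}
    return dict(tags)
```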
7.) Eliminating duplicates will be harder.
For example, this node would be imported:
$ osmium getid --no-progress -f osm AddisMapPOI.osm n228824
<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6" upload="false" generator="osmium/1.14.0">
<node id="-228824" lat="9.0469216" lon="38.7619826">
<tag k="FIXME" v="Removed ???"/>
<tag k="name" v="Karl-Marx-Monument"/>
<tag k="name:de" v="Karl-Marx-Denkmal"/>
<tag k="tourism" v="attraction"/>
</node>
</osm>
but this node already exists, with the same coordinates and names: https://www.openstreetmap.org/node/2721720441
<osm version="0.6" generator="CGImap 0.8.6 (2724028 spike-08.openstreetmap.org)" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
<node id="2721720441" visible="true" version="2" changeset="21144484" timestamp="2014-03-16T20:40:10Z" user="Krteček" uid="272351" lat="9.0469216" lon="38.7619826">
<tag k="historic" v="memorial"/>
<tag k="memorial:type" v="statue"/>
<tag k="name" v="Karl-Marx-Monument"/>
<tag k="name:de" v="Karl-Marx-Denkmal"/>
</node>
</osm>
8.) duplicated hotel example:
import candidate: Foyat Hotel
$ osmium getid --no-progress -f osm AddisMapPOI.osm n302338
<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6" upload="false" generator="osmium/1.14.0">
<node id="-302338" lat="8.991262" lon="38.794178">
<tag k="contact:email" v="info at foyathotel.com"/>
<tag k="mobile" v="+251 966 21 54 32, +251 966 21 54 33, +251 911 52 51 74"/>
<tag k="name" v="Foyat Hotel"/>
<tag k="phone" v="+251 11 660 70 96"/>
<tag k="tourism" v="hotel"/>
</node>
</osm>
The OSM database already contains this hotel twice (2x):
https://www.openstreetmap.org/node/4317313595
lat="8.9912752" lon="38.7942496" name="FOYAT HOTEL"
https://www.openstreetmap.org/node/4298833201
lat="8.9913672" lon="38.7942190" name="Foyat Hotel"
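Both duplicate examples could be caught by comparing names case-insensitively and checking the distance between nodes. A Python sketch, where the 50 m threshold is an arbitrary assumption and real matching would likely need fuzzier name comparison:

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_probable_duplicate(candidate, existing, max_m=50.0):
    """True if the names match (ignoring case) and the nodes are within max_m."""
    same_name = candidate["name"].casefold() == existing["name"].casefold()
    close = distance_m(candidate["lat"], candidate["lon"],
                       existing["lat"], existing["lon"]) <= max_m
    return same_name and close

# Coordinates and names from the examples in this mail.
candidate = {"name": "Foyat Hotel", "lat": 8.991262, "lon": 38.794178}
existing = [
    {"name": "FOYAT HOTEL", "lat": 8.9912752, "lon": 38.7942496},
    {"name": "Foyat Hotel", "lat": 8.9913672, "lon": 38.7942190},
]
matches = [e for e in existing if is_probable_duplicate(candidate, e)]
```

Here both existing nodes match the import candidate, so it would be flagged for manual review instead of being uploaded.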
Regards,
Imre
Alex (AddisMap.com) <alex at addismap.com> wrote (on Thu, May 5, 2022,
20:33):
> Hi,
>
>
>> For start, publish this dataset to allow judge its quality.
>>
>
> sure, http://download.addismap.com/poi/AddisMapPOI.osm.bz2
>
>
>> I am highly dubious that adding cafes, kiosks and hotels
>> based on up to 12 year old data is a good idea.
>>
>> I am unfamiliar with Ethiopia, but in Europe in such old data
>> very large part of object would be not existing anymore.
>
>
> The question is if the benefit from having that data is higher than not
> having it :-) We are thinking about importing the data in a way it does not
> hurt but can be easily updated with existing OSM apps (for example
> StreetComplete).
>
>
>> Also, you would need to contact local community and ask
>> whether importing this dataset is welcome
>> (I see that you posted to mailing list, see
>> https://wiki.openstreetmap.org/wiki/Contact_channels
>> for listings of other possible communities).
>>
>
> Yes, we have some contacts here and will reach out on different channels.
>
>> I would be opposed to this import, but if local community
>> disagrees and wants to add this data - feel free to ignore this.
>>
> okay
>
>
>> Note that imports would require elimination of all duplicates.
>>
>
> okay, sure, yet we are wondering what's an efficient workflow for that
> task. Advice is appreciated :-)
>
> > The check_date would be set to 2016-01-01
>>
>> It should be set to date when object was confirmed to exist.
>>
>
> We don't have the exact date here, just a range. We might be able to
> reconstruct the real check date from other datasets, or maybe can set it to
> 2010 to be on the safe side.
>
> Regards,
> Alexander
>
>
>
> _______________________________________________
> Imports mailing list
> Imports at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/imports
>