[Imports] Bulk Import of Hotels

Greg Troxel gdt at ir.bbn.com
Thu Jan 28 12:37:22 GMT 2010


I've been thinking about how to handle maintenance of imports; so far
OSM has tended to import once and then just edit.  My suggestion is to
put whatever database primary key you use on the node, and to make it
possible to look at the node and tell which import it came from, which
really means the date or other identifier of your database extract.
Then, in a year you can generate a new extract and do some sort of diff:
find the nodes that have not been edited in OSM since the previous
import but whose records have changed in your database, and adjust them.
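
As a rough illustration of that diff, here is a minimal Python sketch.
It assumes three plain dictionaries keyed by your database primary key:
the previous extract, the new extract, and the tags currently on the
matching OSM nodes.  The function name, the field list and the data
layout are all made up for the example; this is not an existing tool.

```python
def plan_updates(old_extract, new_extract, osm_nodes, fields=("name", "phone")):
    """Return {primary_key: {field: new_value}} for nodes that look safe to update.

    All three arguments are hypothetical dicts keyed by the database
    primary key that was stored on the node at import time.
    """
    updates = {}
    for key, new_rec in new_extract.items():
        old_rec = old_extract.get(key)
        node = osm_nodes.get(key)
        if old_rec is None or node is None:
            # New record, or node no longer in OSM: handle by hand.
            continue
        # Treat the node as untouched only if its tags still match
        # what the previous import wrote.
        if any(node.get(f) != old_rec.get(f) for f in fields):
            continue  # edited by a mapper since the import; leave it alone
        changed = {f: new_rec.get(f) for f in fields
                   if new_rec.get(f) != old_rec.get(f)}
        if changed:
            updates[key] = changed
    return updates


old = {"h1": {"name": "Foo", "phone": "1"}, "h2": {"name": "Bar", "phone": "2"}}
new = {"h1": {"name": "Foo", "phone": "9"}, "h2": {"name": "Bar Inn", "phone": "2"}}
osm = {"h1": {"name": "Foo", "phone": "1"}, "h2": {"name": "Bar Hotel", "phone": "2"}}
print(plan_updates(old, new, osm))
```

Here h1 gets its phone number updated, while h2 is skipped because
someone has changed its name in OSM since the import.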

If you have building shapes, that's nice too.

Many things in OSM are just tourism=hotel name=foo operator=bar and
that's it.  But surely you have street addresses and phone numbers and
those would be good too.

As for duplicates, I agree with the sentiment that blindly importing a
db that may have substantial overlap is bad, but if there are a few dups
to be cleaned up I don't think that's so terrible.  So you might, for
each point to be added, look for any node of tourism=hotel within 200m,
and also try to filter on name matching - and then manually look at the
results within 200m that fail name matching, and assess if your name
matching is doing what you want or not.
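
A minimal sketch of that check in Python, assuming each hotel is a dict
with "name", "lat" and "lon" keys (a layout invented for this example)
and using a deliberately crude name comparison that you would want to
tune against your own data:

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def normalize(name):
    """Crude name key: lower-case, letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def classify(candidate, existing_hotels, radius_m=200.0):
    """Return ("add" | "duplicate" | "review", relevant existing nodes)."""
    nearby = [h for h in existing_hotels
              if haversine_m(candidate["lat"], candidate["lon"],
                             h["lat"], h["lon"]) <= radius_m]
    if not nearby:
        return "add", []
    wanted = normalize(candidate["name"])
    matches = [h for h in nearby if normalize(h.get("name", "")) == wanted]
    if matches:
        return "duplicate", matches
    # Something is close by but the names disagree: look at it manually.
    return "review", nearby
```

The "review" bucket is the one to go through by hand; if it is full of
obvious duplicates, the name matching is too strict.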

It could also be that existing nodes are not as well placed as your
data, or the other way around.  If you think your data is better than
what's there, then updating the existing data is totally fine.  What
doesn't go over very well is taking a database of 1000 places that has
errors and coarse positions and overwriting the 20 places put in
accurately by hand.  I have no idea about your data quality - I could
equally imagine it being really good or pretty bad - but it's something
to think about.