<div dir="ltr"><div><div><div><div><div>If you go with adding a ref, I'd use ref:xyz, where xyz is something that identifies whose foreign keys you are using.<br><br></div>For the ones that are wrong in the external source but have been double-checked, you could add<br><br></div>source=survey<br><br></div>or <br><br></div>note=resurveyed<br><br></div>Jo<br><div><div><div><div><div><br><br></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2015-01-10 16:44 GMT+01:00 Jason Remillard <span dir="ltr"><<a href="mailto:remillard.jason@gmail.com" target="_blank">remillard.jason@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Hi Wiktor,<br>

<br>

I don't think an address tag is needed or desirable.<br>

<br>

The best way of doing this is to compare versions of the official data<br>

(perhaps every 6 months), making a list of things that have changed so<br>

that they can be examined in OSM.<br>

<br>

Of course, the big issue is that the matching is not trivial. First,<br>

devise a matching score that combines the distance to the address with the edit<br>

distance in the address name and number. These scores are the weights.<br>

Then use one of the weighted bipartite graph matching algorithms<br>

(augmenting path) that works well on sparse data. If you keep the<br>

search radius down, the graph will be very sparse, so it should be<br>

manageable. Using the match, you can get a list of nodes that have<br>

been moved, deleted, and edited in the official data set.<br>

<br>
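The matching described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested import tool: the record layout (dicts with lat, lon, street, number), the 150 m search radius, the 25 m penalty per edited character, and all function names are hypothetical choices made for the example. It builds a sparse candidate graph between two snapshots of the official data and finds a minimum-cost matching with successive shortest augmenting paths.<br>

```python
import math

def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def distance_m(p, q):
    """Approximate ground distance in metres (equirectangular, short ranges only)."""
    mid_lat = math.radians((p["lat"] + q["lat"]) / 2)
    dx = math.radians(q["lon"] - p["lon"]) * math.cos(mid_lat)
    dy = math.radians(q["lat"] - p["lat"])
    return 6371000 * math.hypot(dx, dy)

def score(p, q, metres_per_edit=25.0):
    """Lower is better: metres apart plus an assumed penalty per edited character."""
    label_p = f"{p['street']} {p['number']}".lower()
    label_q = f"{q['street']} {q['number']}".lower()
    return distance_m(p, q) + metres_per_edit * edit_distance(label_p, label_q)

def candidate_edges(old, new, radius_m=150.0):
    """Keep only pairs inside the search radius so the graph stays sparse.
    A real tool would use a spatial index instead of this all-pairs loop."""
    edges = {}
    for i, p in enumerate(old):
        for j, q in enumerate(new):
            if distance_m(p, q) <= radius_m:
                edges.setdefault(i, {})[j] = score(p, q)
    return edges

def min_cost_matching(edges):
    """Maximum-cardinality, minimum-cost matching on edges[i][j] = weight,
    found by repeatedly augmenting along the cheapest residual path."""
    eps = 1e-9
    match_old, match_new = {}, {}
    while True:
        # Bellman-Ford from every unmatched old record at once.
        dist_old = {i: 0.0 for i in edges if i not in match_old}
        dist_new, parent = {}, {}
        changed = True
        while changed:
            changed = False
            for i, di in list(dist_old.items()):
                for j, w in edges[i].items():
                    if match_old.get(i) == j:
                        continue  # matched edges are only walked backwards
                    if di + w < dist_new.get(j, math.inf) - eps:
                        dist_new[j], parent[j], changed = di + w, i, True
                        k = match_new.get(j)
                        if k is not None:
                            # Undoing the match k-j refunds its weight.
                            dist_old[k] = dist_new[j] - edges[k][j]
        free = [j for j in dist_new if j not in match_new]
        if not free:
            return match_old
        j = min(free, key=dist_new.get)
        while j is not None:  # flip the edges along the augmenting path
            i = parent[j]
            j_prev = match_old.get(i)
            match_old[i], match_new[j] = j, i
            j = j_prev

def diff_snapshots(old, new, radius_m=150.0):
    """Classify records by comparing an older and a newer official snapshot."""
    edges = candidate_edges(old, new, radius_m)
    matching = min_cost_matching(edges)
    matched_new = set(matching.values())
    return {
        "unchanged": [(i, j) for i, j in matching.items() if edges[i][j] == 0],
        "changed": [(i, j) for i, j in matching.items() if edges[i][j] > 0],
        "deleted": [i for i in range(len(old)) if i not in matching],
        "added": [j for j in range(len(new)) if j not in matched_new],
    }
```

Calling diff_snapshots(old, new) with two lists of such records returns index pairs for unchanged and changed addresses, plus the indices of addresses that only appear in one snapshot. Because this sketch always takes a maximum matching, a real tool would probably also want a score cut-off above which a pair is treated as one deletion plus one addition.<br>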

Jason<br>

<div><div class="h5"><br>

On Sat, Jan 10, 2015 at 4:59 AM, Wiktor Niesiobedzki <<a href="mailto:osm@vink.pl">osm@vink.pl</a>> wrote:<br>

> Hi,<br>

><br>

> In Poland we have had quite a few addresses imported from government<br>

> sources for quite a long time, but as time goes on, changes are made to<br>

> the source databases, and local communities don't have any viable<br>

> tools to track what has changed in the source. In the case of the city of<br>

> Skarżysko-Kamienna, a local mapper tried hard to track all the changes<br>

> in the source (as well as check them on site), but still missed a lot of<br>

> changes, and as it stands now there is no tooling to help such users.<br>

><br>

> What I'd like to do is prepare a service that will generate<br>

> changes for OSM containing the differences for each municipality, so a local<br>

> mapper can load them, review them and decide what to import.<br>

><br>

> But this tool, to be efficient, needs additional information to be<br>

> stored in OSM: the identifier of the object in the source database, for<br>

> which I propose the tag ref:addr.<br>

><br>

> This tag would be used to identify what was already imported. In addition,<br>

> I'd like to create a protocol: if there are some "wrong"<br>

> data in the import source, we would leave a point in OSM containing:<br>

> ref:addr<br>

> source:addr<br>

><br>

> That way we can instruct further imports to skip this point unless there<br>

> is some change in the source data.<br>

><br>

> I find this solution the most robust, as it gives a great signal-to-noise<br>

> ratio for local mappers when they are identifying what needs to be<br>

> updated, and it also gives us resilience when someone accidentally<br>

> deletes some address.<br>

><br>

> In Poland there are thousands of people employed by the government to keep<br>

> this data in good quality, and using the OSM community to duplicate their<br>

> work is, in my opinion, wasteful. Using this method, we can use their<br>

> work, and use the OSM community to improve the data that the government is<br>

> sourcing. And this is something we should consider for all<br>

> imports.<br>

><br>

> We have already had some discussion about this in the Polish community, but as<br>

> it seems it might be a philosophical change for this project, I'd like<br>

> to raise this issue at the global level.<br>

><br>

> Apart from addresses I plan to start importing national heritage<br>

> objects, for which I see exactly the same problem.<br>

><br>

> The other solution that we discussed in our community is to keep track<br>

> of the import source state in a separate database, and use this to see what<br>

> has changed in the source and to generate files for local mappers, but I see<br>

> the following disadvantages of such a solution:<br>

> - it doesn't take into account the current state of objects in<br>

> OSM, which may generate duplicates or miss data that were accidentally<br>

> deleted<br>

> - it makes it harder to fork the OSM project, as you need to fork two<br>

> databases and know about both, and the license for the second database<br>

> should be open<br>

> - it still needs some "protocol" for this database to mark that an import<br>

> was done (and to what extent); it would require additional tooling,<br>

> might be an additional problem for casual mappers, and would probably<br>

> render the tool unusable<br>

> - it gives no tools for integrity with OSM databases<br>

> - it needs additional support<br>

><br>

><br>

> The disadvantages of my solution that I found most concerning were:<br>

> - nodes containing only ref:addr and source:addr might be hard for<br>

> newcomers to understand, especially as ref:addr doesn't contain any<br>

> human-understandable data<br>

> - ref:addr might get clobbered when nodes are merged<br>

><br>

> But I hope that with an extensive description on the Wiki we can handle those problems.<br>

><br>

> Cheers,<br>

><br>

> Wiktor Niesiobędzki<br>


<br>

</div></div>_______________________________________________<br>

Imports mailing list<br>

<a href="mailto:Imports@openstreetmap.org">Imports@openstreetmap.org</a><br>

<a href="https://lists.openstreetmap.org/listinfo/imports" target="_blank">https://lists.openstreetmap.org/listinfo/imports</a><br>

</blockquote></div><br></div>