<div dir="ltr"><div><div><div><div><div>If you go with adding a ref, I'd use ref:xyz, where xyz identifies whose foreign keys you are using.<br><br></div>For the ones that are wrong in the external source but have been double-checked, you could add<br><br></div>source=survey<br><br></div>or <br><br></div>note=resurveyed<br><br></div>Jo<br><div><div><div><div><div><br><br></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2015-01-10 16:44 GMT+01:00 Jason Remillard <span dir="ltr"><<a href="mailto:remillard.jason@gmail.com" target="_blank">remillard.jason@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Hi Wiktor,<br>
<br>
I don't think an address tag is needed or desirable.<br>
<br>
The best way of doing this is to compare versions of the official data<br>
(perhaps every 6 months), making a list of things that have changed so<br>
that they can be examined in OSM.<br>
<br>
Of course, the big issue is that the matching is not trivial. First,<br>
devise a matching score that combines the distance between addresses and the edit<br>
distance in the address name and number. These scores are the edge weights.<br>
Then use one of the weighted bipartite graph matching algorithms<br>
(augmenting path) that works well on sparse data. If you keep the<br>
search radius down, the graph will be very sparse, so it should be<br>
manageable. Using the match, you can get a list of nodes that have<br>
been moved, deleted, or edited in the official data set.<br>
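<br>
A minimal Python sketch of the matching described above, under several assumptions of mine: records are dicts with hypothetical keys x, y (projected coordinates in metres), street and number; the score is distance plus a per-edit penalty; and the radius, penalty and unmatched cost are placeholder values. For brevity it builds a dense cost matrix and solves it with the Hungarian (shortest augmenting path) method, so a real tool would want a sparse implementation instead.<br>

```python
import math

def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def pair_cost(old, new, radius_m, metres_per_edit):
    """Matching score for two records, or None if outside the search radius."""
    dist = math.hypot(old["x"] - new["x"], old["y"] - new["y"])
    if dist > radius_m:
        return None
    edits = (edit_distance(old["street"], new["street"])
             + edit_distance(old["number"], new["number"]))
    return dist + metres_per_edit * edits

def min_cost_assignment(cost):
    """Hungarian algorithm with potentials; requires rows <= columns.
    Returns, for each row, the index of the column assigned to it."""
    n, m = len(cost), len(cost[0])
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (m + 1)          # column potentials
    p = [0] * (m + 1)            # p[j] = row matched to column j (0 = none)
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [math.inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], math.inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the matching along the path found
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    result = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            result[p[j] - 1] = j - 1
    return result

def match_addresses(old, new, radius_m=50.0, metres_per_edit=10.0, unmatched_cost=60.0):
    """Map index in `old` to index in `new`; old records left out had no match."""
    if not old:
        return {}
    big = 1e9                    # stands in for "no edge" beyond the radius
    cost = []
    for o in old:
        row = [pair_cost(o, nw, radius_m, metres_per_edit) for nw in new]
        row = [big if c is None else c for c in row]
        row.extend([unmatched_cost] * len(old))   # dummy columns: stay unmatched
        cost.append(row)
    assignment = min_cost_assignment(cost)
    return {i: j for i, j in enumerate(assignment) if j < len(new) and cost[i][j] < big}

old_records = [
    {"x": 0, "y": 0, "street": "Polna", "number": "1"},
    {"x": 100, "y": 0, "street": "Polna", "number": "3"},
    {"x": 500, "y": 500, "street": "Lesna", "number": "7"},
]
new_records = [
    {"x": 102, "y": 0, "street": "Polna", "number": "3"},
    {"x": 5, "y": 0, "street": "Polna", "number": "1A"},
]
print(match_addresses(old_records, new_records))  # {0: 1, 1: 0}
```

In this made-up example the first record is matched as moved and edited, the second as moved, and the third has no match, so it would be listed as deleted in the official data. New records that no old record maps to would be the additions.<br>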
<br>
Jason<br>
<div><div class="h5"><br>
On Sat, Jan 10, 2015 at 4:59 AM, Wiktor Niesiobedzki <<a href="mailto:osm@vink.pl">osm@vink.pl</a>> wrote:<br>
> Hi,<br>
><br>
> In Poland we have had quite a few addresses imported from government<br>
> sources for quite a long time, but as time goes on, changes are made to<br>
> the source databases, and local communities don't have any viable<br>
> tools to track what has changed in the source. In the case of the city of<br>
> Skarżysko-Kamienna, a local mapper tried hard to track all the changes<br>
> in the source (as well as check them on site), but still missed a lot of<br>
> changes, and as it stands now, there is no tooling to help such users.<br>
><br>
> What I'd like to do is prepare a service that will generate<br>
> changes for OSM containing the differences for each municipality, so a local<br>
> mapper can load, review and decide what to import.<br>
><br>
> But this tool, to be efficient, needs additional information to be<br>
> stored in OSM: the identifier of the object in the source database, for<br>
> which I propose the tag ref:addr.<br>
><br>
> This tag is used to identify what was already imported. In addition,<br>
> I'd like to create a protocol: if there is some "wrong"<br>
> data in the import source, we would leave a point in OSM containing:<br>
> ref:addr<br>
> source:addr<br>
><br>
> So we can instruct further imports to skip this point unless there<br>
> is some change in the source data.<br>
><br>
> I find this solution the most robust, as it gives a great signal-to-noise<br>
> ratio for local mappers when they are identifying what needs to be<br>
> updated, and also gives us resilience when someone accidentally<br>
> deletes some address.<br>
><br>
> In Poland there are thousands of people employed by the government to keep<br>
> this data in good quality, and using the OSM community to duplicate their<br>
> work is, in my opinion, wasteful. Using this method, we can use their<br>
> work, and use the OSM community to improve the data that the government is<br>
> sourcing. And this is something we should consider for all<br>
> imports.<br>
><br>
> We have already had some discussion about this in the Polish community, but as<br>
> it seems it might be a philosophical change for this project, I'd like<br>
> to raise this issue at the global level.<br>
><br>
> Apart from addresses I plan to start importing national heritage<br>
> objects, for which I see exactly the same problem.<br>
><br>
> The other solution that we discussed in our community is to keep track<br>
> of the import source state in a separate database, and use this to see what<br>
> has changed in the source and to generate files for local mappers, but I see<br>
> the following disadvantages of such a solution:<br>
> - such a solution doesn't take into account the current state of objects in<br>
> OSM, which may generate duplicates or miss data that was accidentally<br>
> deleted<br>
> - it makes it harder to fork the OSM project, as you need to fork two<br>
> databases and know about them, and the license for such a database should<br>
> be open<br>
> - it still needs some "protocol" for this database to mark that an import<br>
> was done (and to what extent); it would require additional tooling<br>
> and might be an additional problem for casual mappers, and would probably<br>
> render the tool unusable<br>
> - it gives no tools for integrity with OSM databases<br>
> - needs additional support<br>
><br>
><br>
> The disadvantages of my solution that I found most concerning were:<br>
> - nodes containing only ref:addr and source:addr might be hard for<br>
> newcomers to understand, especially since ref:addr doesn't contain any<br>
> human-understandable data<br>
> - ref:addr might get clobbered during a merge of nodes<br>
><br>
> But I hope that with an extensive description on the Wiki we can handle these problems.<br>
><br>
> Cheers,<br>
><br>
> Wiktor Niesiobędzki<br>
><br>
<br>
</div></div>
</blockquote></div><br></div>