<div dir="auto">Many thanks for putting some numbers on this.<div dir="auto"><br></div><div dir="auto">Warin's comment suggests that more than just buildings may be involved.</div><div dir="auto"><br></div><div dir="auto">For buildings, the total number as a percentage is small. Unfortunately, they tend to cluster, so they are more of a problem than if they were more spread out.</div><div dir="auto"><br></div><div dir="auto">John</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Mar 11, 2023, 07:40 Frederik Ramm <<a href="mailto:frederik@remote.org" target="_blank" rel="noreferrer">frederik@remote.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
I think an automatic fix of the problem is possible; however, it would be <br>
a good idea to try and find out what the root cause of the problem is - <br>
bad software, bad imports, bad instructions?<br>
<br>
To get an idea of how big the issue is, I did this on a standard <br>
rendering database:<br>
<br>
create table buildings as (select way,osm_id from planet_osm_polygon <br>
where building is not null);<br>
<br>
select a.osm_id as id_a, b.osm_id as id_b into duplicates from buildings a, buildings b <br>
where a.osm_id < b.osm_id and a.way ~= b.way and st_equals(a.way,b.way);<br>
<br>
This took a few days - probably it could have been done more efficiently <br>
- and resulted in a list of about 70k buildings world-wide that are exact <br>
duplicates (geometry-wise) of other buildings. The list is here:<br>
<br>
<a href="http://www.remote.org/frederik/tmp/duplicatebuildings.csv" rel="noreferrer noreferrer noreferrer" target="_blank">http://www.remote.org/frederik/tmp/duplicatebuildings.csv</a><br>
<br>
Some buildings are in OSM three or four times (contained in the above in <br>
the form of "a is duplicate of b, b is duplicate of c"), but I've <br>
extracted them into separate files: <br>
<a href="http://www.remote.org/frederik/tmp/triplcatebuildings.csv" rel="noreferrer noreferrer noreferrer" target="_blank">http://www.remote.org/frederik/tmp/triplcatebuildings.csv</a> and <br>
<a href="http://www.remote.org/frederik/tmp/quadruplicatebuildings.csv" rel="noreferrer noreferrer noreferrer" target="_blank">http://www.remote.org/frederik/tmp/quadruplicatebuildings.csv</a><br>
<br>
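One way the search might be made faster, and the triplicates and quadruplicates grouped directly, is to hash each geometry and group on the hash instead of doing a self-join. This is only a sketch and not what was run above: it assumes the "buildings" table created earlier, and the names duplicate_groups, geom_hash, copies and osm_ids are made up for illustration. Note that it only finds geometries whose binary form is byte-for-byte identical, so it is stricter than st_equals, which also matches polygons that have the same shape but a different start node or node order.<br>
<br>

```sql
-- Sketch only: groups buildings by an MD5 hash of their WKB geometry.
-- Assumes the "buildings" table (columns way, osm_id) created above.
-- duplicate_groups, geom_hash, copies and osm_ids are illustrative names.
CREATE TABLE duplicate_groups AS
SELECT md5(ST_AsBinary(way))             AS geom_hash,
       count(*)                          AS copies,
       array_agg(osm_id ORDER BY osm_id) AS osm_ids
FROM buildings
GROUP BY md5(ST_AsBinary(way))
HAVING count(*) > 1;

-- How many groups contain 2, 3, 4, ... identical buildings.
SELECT copies, count(*) AS n_groups
FROM duplicate_groups
GROUP BY copies
ORDER BY copies;
```

<br>
Because each group lists all of its members in one row, there is no need to chain "a is duplicate of b, b is duplicate of c" pairs afterwards.<br>
<br>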
I don't have the time to analyse the situation in more detail at present <br>
so if anyone wants to take the above lists as a basis for deeper analysis...<br>
<br>
Cheers<br>
Frederik<br>
<br>
-- <br>
Frederik Ramm ## eMail <a href="mailto:frederik@remote.org" rel="noreferrer noreferrer" target="_blank">frederik@remote.org</a> ## N49°00'09" E008°23'33"<br>
<br>
</blockquote></div>