<div class="gmail_quote">On Tue, May 4, 2010 at 10:50 PM, Brett Henderson <span dir="ltr"><<a href="mailto:brett@bretth.com">brett@bretth.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div class="im">On Tue, May 4, 2010 at 5:56 PM, Ibrahim Bouchrika <span dir="ltr"><<a href="mailto:ibrahim_bouchrika@hotmail.com" target="_blank">ibrahim_bouchrika@hotmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>
The database I'm using is freshly created, so the error must come from duplicate entries in the OSM file. I have the same problem with a duplicate relation in the New York City file I extracted from a Cloudmade map.<br></div></blockquote>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><br>Could the duplicate values be a result of a bug while extracting a bbox with osmosis?<br>
</div></blockquote></div><div><br>I doubt it; extracting a bbox should never create multiple instances of the same id. It's more likely to be caused by incorrect use of the --apply-change task when applying minute or hourly diff files. With older versions of osmosis it was possible to apply a full history diff containing multiple changes for a single entity, which would result in multiple versions of the same entity being written to the result file.<br>
</div><div class="im"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><br>Is there a way around those duplicate entries, like perhaps overwriting duplicate data? Going through all those entries manually seems almost impossible.<br>
</div></blockquote></div><div><br>Hmm, I'm not sure about the best way to tackle this ...<br></div></div><br>I seem to remember Frederik having a clever way of removing duplicates. I think he did something like the following:<br>
1. Convert the entire file to an osc file by wrapping its entire contents in an osmChange element with an action of create or modify.<br>2. Feed the osc file through the --simplify-change task.<br>3. Create an empty osm file, then use the --apply-change task to apply the entire change file to it.<br>
</blockquote><div><br>Found it. His steps were:<br><br><quote><br>Just for laughs - and wonder if you can come up with something better?
- this is what I did to remove the duplicate objects from an existing
OSM file named "faulty.osm":<br>
<br>
echo "<osm></osm>" > empty.osm<br>
<br>
osmosis --rx faulty.osm --rx empty.osm --dc --sort-change-0.6 --simc --wxc good.osc<br>
<br>
osmosis --rxc good.osc --rx empty.osm --ac --wx good.osm<br>
<br>
(The two osmosis steps could be written as one but that would probably make it more confusing.)<br></quote><br><br>Can you try these steps on your cloudmade file and see if that fixes the problem? If it does, Cloudmade has a problem with duplicate data in their files. If it doesn't fix it, then we'll have to dig further :-)<br>
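<br>If it helps to confirm the result, here is a rough sketch (my own, not part of Frederik's recipe) of a small Python script that counts how often each node, way, or relation id appears in an OSM XML file. The function name find_duplicates is made up for illustration, and it assumes a plain uncompressed .osm file:<br>
<pre>
```python
import sys
import xml.etree.ElementTree as ET
from collections import Counter

# Top-level OSM entity element names.
OSM_TYPES = ("node", "way", "relation")


def find_duplicates(path):
    """Return {(type, id): count} for every entity that appears more than once.

    Hypothetical helper: it only looks at the element name and the id
    attribute, so two entries with the same id but different versions
    are still reported as duplicates.
    """
    counts = Counter()
    # iterparse streams the file instead of building the whole tree first.
    for _, elem in ET.iterparse(path, events=("end",)):
        if elem.tag in OSM_TYPES:
            counts[(elem.tag, elem.get("id"))] += 1
        # Drop children and attributes we no longer need.
        elem.clear()
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    dupes = find_duplicates(sys.argv[1])
    for (kind, osm_id), n in sorted(dupes.items()):
        print(f"{kind} {osm_id} appears {n} times")
    # Non-zero exit status if any duplicates were found.
    sys.exit(1 if dupes else 0)
```
</pre>
Saved under a name of your choice, it could be run once against faulty.osm and once against good.osm; if the second run prints nothing, the duplicates are gone.<br>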
<br>Brett<br></div></div><br>