<div dir="ltr"><div><div><div>I'm fairly sure the problem is random bit-flips when writing large files.<br></div><br>I've re-run the command:<br>osmosis --read-pbf-fast file=/pg_xlog/backup/planet-latest.osm.pbf workers=4 --write-pgsql-dump directory=/user/osm<br><br>I ended up with junk between nodes 1884827207 and 1884827211 inclusive.<br><br></div><div>Then I ran the identical command on the same file, and this time I managed to extract the nodes that were previously destroyed. They contain no special characters that could have caused UTF-8 parsing issues, and the nodes are all there, intact.<br></div><div><br></div>Wow, this has cost me perhaps a week! Perhaps it's time to upgrade the RAID array, which is well within support.<br><br></div><div>Best regards,<br><br></div><div>Will<br></div><div><br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 31 March 2015 at 19:34, Paul Norman <span dir="ltr">&lt;<a href="mailto:penorman@mac.com" target="_blank">penorman@mac.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 3/31/2015 4:13 AM, William Temperley wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear all,<br>
<br>
I wonder if someone could point me to a recent version of planet.osm that has been loaded successfully with Osmosis into the postgis snapshot schema, please?<br>
<br>
The previous two versions (planet-150316.osm.bz2 and planet-150323.osm.bz2) are giving me an error:<br>
"org.apache.xerces.impl.io.MalformedByteSequenceException: Invalid byte 2 of 4-byte UTF-8 sequence."<br>
</blockquote></div></div>
I'm downloading one of those to check, but are you sure it's a problem with the planet file and not Osmosis? Did you check the md5sum of what you downloaded?<br>
<br>
In either case, you should not be using the bzipped XML but should instead use a PBF, which is much faster to process.<br>
<br>
_______________________________________________<br>
dev mailing list<br>
<a href="mailto:dev@openstreetmap.org" target="_blank">dev@openstreetmap.org</a><br>
<a href="https://lists.openstreetmap.org/listinfo/dev" target="_blank">https://lists.openstreetmap.org/listinfo/dev</a><br>
</blockquote></div><br></div>