On that note, I think there's a lot of scope for improving import speed in Osmosis.  Currently it does it all with multi-line SQL inserts.  PostgreSQL JDBC drivers now have COPY support and I have tried it out in the --fast-write-pgsql task.  It works well and is much faster.  There's a bit of work to add it to the apidb tasks though so I'm unlikely to attempt it any time soon.<br>
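<br>To make the difference concrete, here is a minimal Python sketch (not Osmosis code, which is Java) of preparing rows in the text format that COPY ... FROM STDIN reads, as opposed to building one large INSERT statement.  The function names are made up for illustration; the escaping follows PostgreSQL's documented COPY text format (tab-separated columns, backslash escapes, \N for NULL).<br>

```python
import io


def copy_escape(value):
    """Render one value in PostgreSQL COPY text format."""
    if value is None:
        return "\\N"  # COPY's default NULL marker
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_copy_buffer(rows):
    """Write rows as tab-separated lines into an in-memory buffer for COPY."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row))
        buf.write("\n")
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf
```

<br>A driver with COPY support can then stream such a buffer to the server, for example with psycopg2's cursor.copy_expert and a statement of the form COPY nodes (id, tags) FROM STDIN, where the table and column names are hypothetical.<br>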

<br>As for negative ids, I always intended to add the ability to add new OSM data directly to a database but never got around to it.  Using the bulk uploader scripts is certainly the safest option on the production database.<br>

<br><div class="gmail_quote">On Wed, Aug 4, 2010 at 8:06 AM, Eric Wolf <span dir="ltr">&lt;<a href="mailto:ebwolf@gmail.com">ebwolf@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Just killed the bulk_upload.py job, dropped database and recreated it.<div><br></div><div>Used sed to fix the negative numbers.<br><div><br></div><div>osmosis took 427263 milliseconds.</div><div><br></div><div>Yes. I did update the ID sequences in postgres.</div>


<div><br></div><div>Things are much happier without all that negativism. It's still very slow in Potlatch. At least part of the problem is the insane complexity of the features (yes, that straight road segment needs 82 nodes!)</div>


<div><br></div><font color="#888888"><div>-Eric</div></font><div><div class="im"><br clear="all">-=--=---=----=----=---=--=-=--=---=----=---=--=-=-<br>Eric B. Wolf                           720-334-7734<br><br><br><br>

<br><br></div><div><div></div><div class="h5"><div class="gmail_quote">On Tue, Aug 3, 2010 at 2:55 PM, Ian Dees <span dir="ltr">&lt;<a href="mailto:ian.dees@gmail.com" target="_blank">ian.dees@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

I imagine the bottleneck is the Rails port doing precondition checks for everything as it's going in.<div><br></div><div>I don't think I could give an educated guess for time remaining, but on the <a href="http://api.osm.org" target="_blank">api.osm.org</a> server it usually takes 4+ hours to send in a 50k-change diff file (around 25MB?). Based on that I'd say you have at least half a day of waiting left.</div>


<div><br><div class="gmail_quote"><div><div></div><div>On Tue, Aug 3, 2010 at 3:46 PM, Eric Wolf <span dir="ltr">&lt;<a href="mailto:ebwolf@gmail.com" target="_blank">ebwolf@gmail.com</a>&gt;</span> wrote:<br>

</div></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div><div></div><div>

Just how slow is bulk_upload.py?<div><br></div><div>I am loading a 177MB .osm file into an empty database on a quad 3.6GHz Xeon with 6GB RAM and 700GB of RAID5. The machine is basically idle except for this load.</div><div>


<br></div><div>It's already taken almost an hour.</div><div><br></div><div>-Eric</div><div><br clear="all">-=--=---=----=----=---=--=-=--=---=----=---=--=-=-<br>Eric B. Wolf                           720-334-7734<br>


<br>

<br><br>

<br><br><div class="gmail_quote">On Tue, Aug 3, 2010 at 12:48 PM, andrzej zaborowski <span dir="ltr">&lt;<a href="mailto:balrogg@gmail.com" target="_blank">balrogg@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">


<div>On 3 August 2010 20:28, Eric Wolf &lt;<a href="mailto:ebwolf@gmail.com" target="_blank">ebwolf@gmail.com</a>&gt; wrote:<br>

> This is in reference to the USGS OSMCP project - not the real OSM...<br>

> When we imported our chunk of data initially (not me - the guy responsible<br>

> is on walkabout in the Rockies), we followed the convention of using<br>

> negative IDs in the .OSM file. But osmosis was used to load the data into<br>

> the database and now all of our data has negative IDs. This seems to have a<br>

> really nasty effect on the API - every time something is edited, a new copy<br>

> is created with positive IDs and the old version with the negative IDs<br>

> persists.<br>

> I assume there is something in the API that says "negative IDs == BAD". I've<br>

> been trying to test that theory but keep hitting stumbling blocks. Postgres<br>

> doesn't seem to want to let me defer integrity constraints, so my efforts to<br>

> change a few IDs to positive values keep failing. Maybe I've lost my SQL<br>

> chops (or maybe I just can't do that as the "openstreetmap" database user).<br>

> Am I barking up the right tree? Should I just go ahead and destroy the<br>

> database and repopulate it using bulk_upload.py instead of osmosis?<br>

<br>

</div>If there's no way to disable the postgres constraints (I'm sure there is..<br>

but I'm a sql noob), I'd filter your .osm file through sed removing<br>

the '-' in 'ref="-'  and 'id="-' and reimport with osmosis, or modify<br>

your conversion script.  Using bulk_upload.py and the API will take<br>

ages.<br>
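<br>The sed filter described above could be sketched in Python roughly as follows.  This is an illustration of the idea rather than the command that was actually run; the function names are hypothetical, and the pattern is slightly stricter than a plain text substitution in that it only touches whole id and ref attributes followed by digits, so values such as negative lat/lon coordinates are left alone.<br>

```python
import re

# Matches id="-123" or ref="-123" (single or double quotes) and captures
# the attribute name, the opening quote, and the digits after the minus sign.
NEGATIVE_ID = re.compile(r'\b(id|ref)=(["\'])-(\d+)')


def strip_negative_ids(line):
    """Return the line with the leading '-' dropped from id and ref values."""
    return NEGATIVE_ID.sub(r"\1=\2\3", line)


def filter_stream(src, dst):
    """Copy src to dst line by line, rewriting negative ids on the way."""
    for line in src:
        dst.write(strip_negative_ids(line))


# Example use as a filter (reads the .osm file on stdin, writes to stdout):
#   import sys
#   filter_stream(sys.stdin, sys.stdout)
```

<br>Note that this simply flips the sign, so it is only safe when the resulting positive ids cannot collide with ids already in the target database, as with an empty database.<br>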

<br>

Cheers<br>

</blockquote></div><br></div>

<br></div></div><div>_______________________________________________<br>

dev mailing list<br>

<a href="mailto:dev@openstreetmap.org" target="_blank">dev@openstreetmap.org</a><br>

<a href="http://lists.openstreetmap.org/listinfo/dev" target="_blank">http://lists.openstreetmap.org/listinfo/dev</a><br>

<br></div></blockquote></div><br></div>

</blockquote></div><br></div></div></div></div>


<br></blockquote></div><br>