[OSM-dev] How to clean up Negative IDs

Brett Henderson brett at bretth.com
Wed Aug 4 01:48:07 BST 2010


On that note, I think there's a lot of scope for improving import speed in
Osmosis.  Currently it does it all with multi-line SQL inserts.  PostgreSQL
JDBC drivers now have COPY support and I have tried it out in the
--fast-write-pgsql task.  It works well and is much faster.  There's a bit
of work to add it to the apidb tasks though so I'm unlikely to attempt it
any time soon.
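
To make the difference concrete, here is a rough sketch that builds both forms for the same rows: one multi-row INSERT statement, and the tab-separated payload that COPY ... FROM STDIN reads in PostgreSQL's text format. This is not Osmosis code. The helper names are invented for illustration, and the quoting in the INSERT helper is deliberately naive (numbers, NULL and simple strings only).

```python
def sql_literal(value):
    """Naive SQL literal: NULL, bare numbers, or a single-quoted string."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def multi_row_insert(table, columns, rows):
    """Build one INSERT statement carrying every row in its VALUES list."""
    values = ", ".join(
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values};"


def copy_field(value):
    """Encode one field for COPY text format (\\N marks NULL)."""
    if value is None:
        return r"\N"
    text = str(value)
    # Backslash must be escaped first so later escapes are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def copy_payload(rows):
    """Build the data stream for COPY: one tab-separated line per row."""
    return "".join("\t".join(copy_field(v) for v in row) + "\n" for row in rows)
```

With COPY the rows travel as plain data lines rather than as SQL text, which is the form a driver's COPY support would stream to the server.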

As for negative ids, I always intended to add the ability to add new OSM
data directly to a database but never got around to it.  Using the bulk
uploader scripts is certainly the safest option on the production database.

On Wed, Aug 4, 2010 at 8:06 AM, Eric Wolf <ebwolf at gmail.com> wrote:

> Just killed the bulk_upload.py job, dropped database and recreated it.
>
> Used sed to fix the negative numbers.
>
> osmosis took 427263 milliseconds.
>
> Yes. I did update the ID sequences in postgres.
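
Resetting the ID sequences after a direct load could look something like the sketch below, which only builds the SQL text and does not touch a database. The table and sequence names are assumptions about the apidb schema and do not come from this thread, so check them against the actual database before running anything.

```python
# Hypothetical table -> sequence mapping; verify against the real schema.
SEQUENCES = {
    "current_nodes": "current_nodes_id_seq",
    "current_ways": "current_ways_id_seq",
    "current_relations": "current_relations_id_seq",
}


def setval_statements(sequences=SEQUENCES):
    """Return one setval() statement per table, moving each sequence
    up to the largest id currently stored (or 1 if the table is empty)."""
    return [
        f"SELECT setval('{seq}', (SELECT COALESCE(MAX(id), 1) FROM {table}));"
        for table, seq in sequences.items()
    ]
```

Printing the returned statements and feeding them to psql is one way to apply them; the next ID handed out by each sequence is then above anything already loaded.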
>
> Things are much happier without all that negativism. It's still very slow
> in Potlatch. At least part of the problem is the insane complexity of the
> features (yes, that straight road segment needs 82 nodes!)
>
> -Eric
>
> -=--=---=----=----=---=--=-=--=---=----=---=--=-=-
> Eric B. Wolf                           720-334-7734
>
>
>
>
>
> On Tue, Aug 3, 2010 at 2:55 PM, Ian Dees <ian.dees at gmail.com> wrote:
>
>> I imagine the bottleneck is the Rails port doing precondition checks for
>> everything as it's going in.
>>
>> I don't think I could give an educated guess for time remaining, but on
>> the api.osm.org server it usually takes 4+ hours to send in a 50k-change
>> diff file (around 25MB?). Based on that I'd say you have at least half a day
>> of waiting left.
>>
>> On Tue, Aug 3, 2010 at 3:46 PM, Eric Wolf <ebwolf at gmail.com> wrote:
>>
>>> Just how slow is bulk_upload.py?
>>>
>>> I am loading a 177MB .osm file into an empty database on a quad 3.6GHz
>>> Xeon with 6GB RAM and 700GB of RAID5. The machine is basically idle except
>>> for this load.
>>>
>>> It's already taken almost an hour.
>>>
>>> -Eric
>>>
>>> -=--=---=----=----=---=--=-=--=---=----=---=--=-=-
>>> Eric B. Wolf                           720-334-7734
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Aug 3, 2010 at 12:48 PM, andrzej zaborowski <balrogg at gmail.com> wrote:
>>>
>>>> On 3 August 2010 20:28, Eric Wolf <ebwolf at gmail.com> wrote:
>>>> > This is in reference to the USGS OSMCP project - not the real OSM...
>>>> > When we imported our chunk of data initially (not me - the guy
>>>> > responsible is on walkabout in the Rockies), we followed the convention
>>>> > of using negative IDs in the .OSM file. But osmosis was used to load the
>>>> > data into the database and now all of our data has negative IDs. This
>>>> > seems to have a really nasty effect on the API - every time something is
>>>> > edited, a new copy is created with positive IDs and the old version with
>>>> > the negative IDs persists.
>>>> > I assume there is something in the API that says "negative IDs == BAD".
>>>> > I've been trying to test that theory but keep hitting stumbling blocks.
>>>> > Postgres doesn't seem to want to let me defer integrity constraints, so
>>>> > my efforts to change a few IDs to positive values keep failing. Maybe
>>>> > I've lost my SQL chops (or maybe I just can't do that as the
>>>> > "openstreetmap" database user).
>>>> > Am I barking up the right tree? Should I just go ahead and destroy the
>>>> > database and repopulate it using bulk_upload.py instead of osmosis?
>>>>
>>>> If there's no way to disable the postgres constraints (I'm sure there is..
>>>> but I'm a sql noob), I'd filter your .osm file through sed removing
>>>> the '-' in 'ref="-'  and 'id="-' and reimport with osmosis, or modify
>>>> your conversion script.  Using bulk_upload.py and the API will take
>>>> ages.
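
The sed filter described above could equally be written as a few lines of Python. This is a minimal sketch, assuming every negative ID appears as an id or ref attribute value (single or double quoted) and that no positive ID already in the file collides with the results. The function name strip_negative_ids is made up for the example.

```python
import re

# Matches id="-123" or ref='-123'. The \b stops attributes that merely
# end in "id" (such as uid) from matching, and \2 requires the closing
# quote to be the same character as the opening one.
NEGATIVE_ID = re.compile(r"""\b(id|ref)=(["'])-(\d+)\2""")


def strip_negative_ids(line: str) -> str:
    """Return the line with the minus sign dropped from id/ref values."""
    return NEGATIVE_ID.sub(r"\1=\2\3\2", line)


sample = '<node id="-1" lat="40.0" lon="-105.2"/>'
print(strip_negative_ids(sample))
```

Applying strip_negative_ids to each line of the .osm file gives the same effect as the sed approach, while leaving negative coordinates such as lon="-105.2" untouched.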
>>>>
>>>> Cheers
>>>>
>>>
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openstreetmap.org
>>> http://lists.openstreetmap.org/listinfo/dev
>>>
>>>
>>
>