[osmosis-dev] Duplicate ways in pgsnapshot database

Thu Jan 31 01:47:50 GMT 2013

Today my minutely replication started failing with a unique constraint
violation error from postgres. Upon further investigation I found that
there were *already* two copies of a way in my database. An incoming
change was trying to modify the way which caused postgres to notice
the duplication and error out. Basically a "hey wait there are two of
them. Which one do you want me to modify?" Here is the osmosis output:

Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key
value violates unique constraint "pk_ways"
  Detail: Key (id)=(26926573) already exists.

It was erroring on this way:
http://www.openstreetmap.org/browse/way/26926573/history

So a few questions immediately come to mind.

1) How did a duplicate record get into the database? There is
definitely a primary key constraint on the id column. In this
particular case it looks like it happened during the initial planet
import. I did this from the January 2nd pbf file. The two rows are
identical in every way and the way was last touched (before today's
edit) in 2009. All constraints are disabled during the \copy operation
so I can see a duplicate way being able to get in. Although this
implies that there are either two copies of the way in the planet file
or a bug in osmosis. I would have thought the primary key constraint
would have been checked when it was recreated after the \copy
operation though. Apparently not.

2) How do I fix this? I believe deleting one of the rows would fix
this but I can't actually delete only one since *every* column is the
same. I think it was suggested on #osm-dev that I create a copy of one
in temp table, delete both and then reinsert the copy. This is
probably what I will try.

3) Are there any others? Turns out: yes, there are 4 duplicated ways
in my database. This may not come through with good formatting but
here they are:
    id    | version | user_id |       tstamp
----------+---------+---------+---------------------
 26245218 |      12 |  163673 | 2011-02-06 06:54:10
 26245218 |      13 |  290680 | 2013-01-28 02:37:56
 26709186 |       4 |   64721 | 2008-09-02 04:39:21
 26709186 |       4 |   64721 | 2008-09-02 04:39:21
 26709284 |       4 |   70621 | 2008-10-26 14:06:03
 26709284 |       5 |   64721 | 2013-01-28 02:38:30
 26926573 |       4 |  118011 | 2009-12-27 07:13:28
 26926573 |       4 |  118011 | 2009-12-27 07:13:28

A couple of interesting things here.
- Two of them have identical duplicates (26709186 and 26926573). These
can both be explained by an error in the planet file or import
process.
- The other two however are not the same and both of them must have
been created during diff application because it happened 2 days ago -
within 10 seconds of each other. It is possible that there were
duplicates of these ways as well and for some reason they just didn't
hit this error during diff application and one of the records was
successfully updated.

Soo... wtf? Does Does anyone have ideas about how postgres' primary
key check could be circumvented? Is my theory about the \copy getting
around it during import feasible? But what about the ones created
during diff processing? Looking at my system monitoring I don't see
anything unusual going on 2 days ago. I've been having problems with X
on this machine but that won't affect postgres and osmosis is running
inside of screen. Soo... yeah. Anything? :)

Toby