[OSM-dev] Osmosis questions about applying changesets
Brett Henderson
brett at bretth.com
Wed Dec 8 08:51:11 GMT 2010
On Wed, Dec 8, 2010 at 4:39 AM, Andreas Kalsch <andreaskalsch at gmx.de> wrote:
> On 07.12.10 14:11, Andreas Kalsch wrote:
>
> 1) What does the data_type "U" in the actions table mean?
>>
> Of course: users, but 2 and 3 are still not clear to me.
>
> 2) It seems that Osmosis violates pk_aktions (primary key), so it would be
>> better to replace it by a simple index.
>>
>
The actions table supports only a single action per entity. For example, if
you have just added a node with id 1,000,000 and then try to modify the
same node, a second entry for that node is inserted into the table and the
primary key is violated.
The reason you're seeing this is that you're feeding the output of
--read-replication-interval directly into the --write-pgsimp-change task.
The --read-replication-interval task can produce multiple changes for a
single entity because it is a full history task (some consumers wish to know
all changes, not just the last one).
To avoid this, add a --simplify-change task after
--read-replication-interval and before the --write-pgsimp-change task. It
will collapse multiple changes for a single entity (i.e. several changes to
one entity with different version numbers) into a single change record.
This will prevent the actions table primary key from being violated.
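
As a rough sketch, the command from your mail with the extra task inserted
might look like the following. The environment variables are the ones from
your original command; the run_update function name and the OSMOSIS override
variable are only illustrative names I've made up here, not part of Osmosis.

```shell
# Sketch only: the pipeline from the original question with
# --simplify-change placed between the reader and the writer.
# OSMOSIS is a hypothetical override for the executable path; when it is
# unset, the path from the original command is used.
run_update() {
  "${OSMOSIS:-${OSM_DIRNAME_OSMOSIS}bin/osmosis}" \
    --read-replication-interval "$OSM_DIRNAME_REPLICATION" \
    --simplify-change \
    --write-pgsimp-change database="$OSM_DB_NAME" \
      validateSchemaVersion=no user="$OSM_DB_USER" password="$OSM_DB_PW" \
      host="$OSM_DB_HOST"
}
```

Calling run_update then executes the whole pipeline once, with at most one
change per entity reaching the writer.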
> 3) This command does not seem to retrieve all changesets. After call, the
>> latest timestamp was 20.11.2010.
>>
>> ${OSM_DIRNAME_OSMOSIS}bin/osmosis \
>> --read-replication-interval "$OSM_DIRNAME_REPLICATION" \
>> --write-pgsimp-change database="$OSM_DB_NAME" \
>> validateSchemaVersion=no user="$OSM_DB_USER" password="$OSM_DB_PW" \
>> host="$OSM_DB_HOST"
>>
>> My server's time is correct.
>>
>
You need to be aware of several points when consuming replication data:
1. The amount of data retrieved by a single Osmosis execution is
limited. The --read-replication-interval task retrieves a number of change
files bounded by the maxDownloadCount parameter in the config file.
2. Change files contain data sequenced by database transaction
identifiers, not timestamp. It is possible for data to be retrieved out of
chronological order, but it will always be ordered correctly from a
transactional standpoint.
I suspect that in your case you simply need to run your Osmosis command
multiple times to get all of the data. Alternatively, increase the
maxDownloadCount parameter, but keep in mind that if something goes wrong
the entire transaction will be rolled back, forcing you to download all of
the files again. The default of 20 is not a bad place to start.
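
If you'd rather not re-run the command by hand, a small wrapper can repeat
it until nothing new arrives. This is only a sketch: it assumes the
replication working directory holds a state.txt file that changes after each
run that retrieved data, and the catch_up function name and its arguments
are made up for illustration.

```shell
# catch_up WORKDIR MAX_RUNS COMMAND [ARGS...]
# Runs COMMAND repeatedly until WORKDIR/state.txt stops changing or
# MAX_RUNS executions have been made, then prints the number of runs.
# Assumption: the command updates state.txt whenever it retrieves data.
catch_up() {
  workdir=$1
  max_runs=$2
  shift 2
  runs=0
  while [ "$runs" -lt "$max_runs" ]; do
    before=$(cat "$workdir/state.txt" 2>/dev/null)
    "$@" || return 1          # stop immediately if a run fails
    runs=$((runs + 1))
    after=$(cat "$workdir/state.txt" 2>/dev/null)
    # An unchanged state file means the last run found nothing new.
    [ "$before" = "$after" ] && break
  done
  echo "$runs"
}
```

You would pass your full Osmosis command line as COMMAND. The MAX_RUNS limit
is there so that a persistent problem doesn't leave the loop running forever.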
>> 4) I think I am right that action "M" means insert?
>>
>
The actions are "C" for create, "M" for modify, and "D" for delete. Whether
a change is recorded as a create or a modify is determined by what is
currently in the database, not by what is supplied in the incoming change
data. So you should be able to rely on it for updating derivative tables.
Brett