[OSM-dev] history & disk space
Brett Henderson
brett at bretth.com
Thu Sep 27 16:53:44 BST 2007
Robert (Jamie) Munro wrote:
> Once osmosis is running, it won't just be a rare api call, it will be an
> hourly (or even more frequently) used db query.
>
> If you add a "current" column to the history tables, and put it at the
> start of some of the indexes, it may be possible to run all queries
> directly from the history tables, and the current tables become
> redundant. In postgres and other database systems, you can create a
> conditional index which will only index the current rows for this sort
> of purpose. You can use rules to ensure that only one record is current
> at a time. Also, you can use views to hide this stuff from the
> queries and even make the view updatable with triggers, so the front end
> just makes updates as though there was no history at all, and the DB
> handles it all.
>
What he said :-)
At the moment the history tables provide a complete view of OSM data at
any point in time. This solves several problems surrounding the
production of consistent snapshots and change sets. In theory you could
work around this by reading both current and history tables and merging
the results, but without decent database transaction support this becomes
nigh impossible and exceptionally error-prone. It will work in the vast
majority of cases when querying the history for a single entity, but
osmosis will be at the mercy of constantly changing data while it's
performing reads of changes for all entities in the database.
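To illustrate why a complete history makes snapshots straightforward, here is
a minimal sketch of a point-in-time query. It is not the real OSM schema: the
node_history table, its columns and the sample rows are all made up for the
example, and SQLite (via Python's sqlite3 module) stands in for the production
database. The assumption is that history rows are only ever appended, so the
result for a given cutoff does not change while new edits keep arriving.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical node history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE node_history (
            id        INTEGER NOT NULL,
            version   INTEGER NOT NULL,
            timestamp TEXT    NOT NULL,  -- ISO 8601, so string order is time order
            lat       REAL    NOT NULL,
            lon       REAL    NOT NULL,
            visible   INTEGER NOT NULL,  -- 0 marks a deleted version
            PRIMARY KEY (id, version)
        )
        """
    )
    conn.executemany(
        "INSERT INTO node_history VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "2007-09-01T00:00:00", 51.0, 0.0, 1),
            (1, 2, "2007-09-20T00:00:00", 51.5, 0.1, 1),
            (2, 1, "2007-09-10T00:00:00", 48.0, 2.0, 1),
            (2, 2, "2007-09-25T00:00:00", 48.0, 2.0, 0),  # node 2 deleted
        ],
    )
    return conn


def snapshot_at(conn, cutoff):
    """Return the newest visible version of every node as of `cutoff`."""
    return conn.execute(
        """
        SELECT h.id, h.version, h.lat, h.lon
        FROM node_history AS h
        JOIN (
            -- newest version of each node written at or before the cutoff
            SELECT id, MAX(version) AS version
            FROM node_history
            WHERE timestamp <= ?
            GROUP BY id
        ) AS latest
          ON latest.id = h.id AND latest.version = h.version
        WHERE h.visible = 1
        ORDER BY h.id
        """,
        (cutoff,),
    ).fetchall()
```

Calling snapshot_at(conn, "2007-09-15T00:00:00") on the demo data sees the
first version of both nodes, while a later cutoff sees node 1 at version 2 and
no node 2, since its newest version by then is a deletion.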
It would be nice to avoid the current duplication of data, but I don't
think avoiding writing to the history tables is a good solution.
Removing current tables may be the way to solve it. It also nicely
solves any issues with inconsistencies between current and history data.
Somebody provided a link to an alternative schema on IRC recently, but
I can't remember who it was. That schema merged history and current
into a single table.
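For what it's worth, here is a rough sketch of what a merged table along the
lines Jamie describes might look like. It is not the schema from IRC and not
the real OSM schema: the table, column, index, view and trigger names are all
invented, and SQLite (via Python's sqlite3 module) is used only so the example
runs anywhere. It shows a flag column marking the current row, a conditional
(partial) unique index that covers only current rows and allows at most one
per node, and a view with a trigger so the front end can update as though
there were no history.

```python
import sqlite3

# Hypothetical merged schema: one table holds every version of every node.
SCHEMA = """
CREATE TABLE nodes (
    id         INTEGER NOT NULL,
    version    INTEGER NOT NULL,
    lat        REAL    NOT NULL,
    lon        REAL    NOT NULL,
    is_current INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (id, version)
);

-- Conditional index: only current rows are indexed, at most one per node.
CREATE UNIQUE INDEX nodes_current_idx ON nodes (id) WHERE is_current = 1;

-- The front end reads and updates this view and never sees old versions.
CREATE VIEW current_nodes AS
    SELECT id, version, lat, lon FROM nodes WHERE is_current = 1;

-- An update through the view retires the old row and appends a new version.
CREATE TRIGGER current_nodes_update INSTEAD OF UPDATE ON current_nodes
BEGIN
    UPDATE nodes SET is_current = 0 WHERE id = OLD.id AND is_current = 1;
    INSERT INTO nodes (id, version, lat, lon, is_current)
    VALUES (OLD.id, OLD.version + 1, NEW.lat, NEW.lon, 1);
END;
"""


def demo():
    """Create a node, move it via the view, and return (current rows, history)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO nodes (id, version, lat, lon) VALUES (1, 1, 51.0, 0.0)")
    # The caller updates the view as if it were a plain current table.
    conn.execute("UPDATE current_nodes SET lat = 51.5, lon = 0.1 WHERE id = 1")
    current = conn.execute(
        "SELECT id, version, lat, lon FROM current_nodes"
    ).fetchall()
    history = conn.execute(
        "SELECT version, is_current FROM nodes WHERE id = 1 ORDER BY version"
    ).fetchall()
    return current, history
```

After demo() runs, the view shows only version 2 of the node, while the
underlying table still holds version 1 with its flag cleared. In postgres the
same idea would presumably use a partial index plus rules or triggers on the
view, but the details would differ from this SQLite sketch.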