[OSM-dev] minute diff - max delay

Sun Sep 12 12:28:12 BST 2010

On Sat, Aug 14, 2010 at 11:38 AM, Brett Henderson <brett at bretth.com> wrote:

> On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes <tom at compton.nu> wrote:
>
>> On 14/08/10 00:19, Grant Slater wrote:
>>
>>> On 14 August 2010 00:10, Brett Henderson <brett at bretth.com> wrote:
>>>
>>>>
>>>> Is anybody aware of anything that happened on that day *other* than the
>>>> database upgrade?  Any new imports, etc.
>>>>
>>>>
>>> The database was fully re-imported (planned and triple backed up) and
>>> the transaction IDs were reset due to this.
>>> zere was able to set the transaction id used by osmosis diff export
>>> because I believe you were not around or weren't available at the
>>> time.
>>>
>>> Also: PostgreSQL 8.3 -> 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks.
>>> RAID stripe size changed from 256KB to 64KB.
>>>
>>
>> There's not really any great mystery here, we know it was the upgrade to
>> postgres 8.4 (or just as likely the reimport of the db) that triggered it.
>>
>
> Okay.  I didn't realise that a database upgrade had occurred, I thought it
> was only disk/RAID changes.
>
>
>>
>> We just need to get to the bottom of what is making some of the queries
>> run slowly, but it's not a very easy thing to do.
>>
>
> Is it only Osmosis queries that are running slowly?
>
>
>> My assumption was that it was choosing a bad execution plan as the way our
>> schema works tends to confuse Postgres's statistics, but the plan I looked
>> at didn't show any sign of that.
>>
>> Equally it doesn't seem to be a lock contention issue.
>>
>
> Is there anything I can add that might make it easier to investigate such
> as additional query options, log query timings, etc?  I'm not sure what to
> try at this point.  About the only thing I can think to do is to set up a
> local database and try to replicate the problem.  I've been meaning to do
> that but it's not a quick task and I haven't had much time to spend on it.
>

I've just upgraded Osmosis from the 0.35 release to the current 0.37
snapshot.  I've introduced a relatively minor change that, on initial
testing, appears to have fixed the problem.  I create a number of temp
tables during replication processing to hold the identifiers (actually id
and version) of nodes, ways and relations.  I am now adding a primary key to
those tables, which should help the query planner come up with a more
effective query plan.  I'm not sure why I didn't do that originally ...
perhaps I just missed it.
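
A minimal sketch of the idea, not the actual Osmosis code: it uses Python's
built-in sqlite3 module instead of PostgreSQL, and the table and column names
(nodes, tmp_nodes, tags) are hypothetical.  It builds a temp table of
(id, version) pairs with or without a primary key, so the resulting index and
the planner's chosen join plan can be compared.  SQLite's planner is not
PostgreSQL's, so this only illustrates the mechanism, not the timings seen on
the production database.

```python
import sqlite3


def make_nodes(conn):
    """Create a stand-in history table keyed on (id, version). Names are hypothetical."""
    conn.execute(
        "CREATE TABLE nodes (id INTEGER NOT NULL, version INTEGER NOT NULL, "
        "tags TEXT, PRIMARY KEY (id, version))"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)", [(i, 1, "") for i in range(1000)]
    )


def build_temp_table(conn, with_primary_key):
    """(Re)create a temp table of (id, version) pairs, optionally with a primary key."""
    pk = ", PRIMARY KEY (id, version)" if with_primary_key else ""
    conn.execute("DROP TABLE IF EXISTS temp.tmp_nodes")
    conn.execute(
        "CREATE TEMP TABLE tmp_nodes "
        "(id INTEGER NOT NULL, version INTEGER NOT NULL" + pk + ")"
    )
    conn.executemany(
        "INSERT INTO tmp_nodes VALUES (?, ?)", [(i, 1) for i in range(0, 1000, 100)]
    )


def temp_indexes(conn):
    """Return the names of the indexes that exist on the temp table."""
    return [row[1] for row in conn.execute("PRAGMA temp.index_list('tmp_nodes')")]


def join_plan(conn):
    """Return the planner's description of the join between the two tables."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT n.* FROM nodes n "
        "JOIN tmp_nodes t ON n.id = t.id AND n.version = t.version"
    ).fetchall()
    return [row[-1] for row in rows]


def matching_rows(conn):
    """Count the rows selected by the join; the key must not change this."""
    return conn.execute(
        "SELECT COUNT(*) FROM nodes n "
        "JOIN tmp_nodes t ON n.id = t.id AND n.version = t.version"
    ).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_nodes(conn)
    for with_pk in (False, True):
        build_temp_table(conn, with_pk)
        print("primary key:", with_pk, "indexes:", temp_indexes(conn))
        for step in join_plan(conn):
            print("   ", step)
```

The primary key gives the temp table a unique index on (id, version), which
tells the planner that each pair appears at most once and gives it an extra
access path to consider.  The rows returned by the join are the same either
way; only the plan may differ.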

I'm a bit surprised that it has fixed it given that the amount of data in
the temp tables is relatively small and query analysis wasn't pointing at
poor query plans, but it seems to be running *much* faster now.

The new version took effect from replication number 906 onwards, so if
anybody sees any issues please let me know.

Brett